July Saturday 21



Arduino Open Source Community member - Tier 1 Full Stack Developer

Arduino Assembly Language: AVR Assembler

2018 JANUARY | by Gene Casanova

Senior Systems Engineer

How To Write Assembly Language Programs For The Atmega328p Microcontroller & The Arduino


  1. Learn to write assembly language source-code, to initialize an AVR ATmega328P MCU and have it run a program.
  2. Understand the assembly language source-code.
  3. Use the Atmel Studio 7 IDE source-code development platform.
  4. Learn to upload assembly source-code to an 'Arduino UNO' development board, directly from the Atmel Studio 7 IDE application.

Covering the above steps is a massive task; the complete guide is accessible through a paid subscription.  Free information is worth free; paid information is for the more serious person, seeking serious knowledge and valued skills.

Assembly computer programming language, abbreviated "assembly language" and "asm", is a low-level computer-programming language, used to make a machine-code program for a specific central processing unit (CPU) IC chip.  There are many assembly languages; each made for a specific CPU architecture.

The Atmega328p Microcontroller

The AVR MCU brand represents a dominant architecture in the embedded design universe in 2018.  This information can be used with the other AVR microcontroller models offered by Microchip, the current manufacturer and owner of the "AVR" brand.  With a combined 45 years' experience developing commercially available and cost-effective 8-bit MCUs, Microchip has become the supplier of choice for many, due to its strong legacy and history of innovation in 8-bit.

Details of the ATmega328P-PU DIP IC microcontroller are published online:



The Atmel® picoPower® ATmega328/P is a low-power CMOS 8-bit microcontroller based on the AVR® enhanced RISC architecture.

The high-performance Microchip picoPower 8-bit AVR RISC-based microcontroller, combines 32KB ISP flash memory with read-while-write capabilities, 1024B EEPROM, 2KB SRAM, 23 general purpose I/O lines, 32 general purpose working registers, three flexible timer/counters with compare modes, internal and external interrupts, serial programmable USART, a byte-oriented 2-wire serial interface, SPI serial port, a 6-channel 10-bit A/D converter (8-channels in TQFP and QFN/MLF packages), programmable watchdog timer with internal oscillator, and five software selectable power saving modes. The device operates between 1.8-5.5 volts.

By executing powerful instructions in a single clock cycle, this MCU achieves throughputs approaching 1 MIPS per MHz, balancing power consumption and processing speed.  131 powerful instructions are available.

The Software Developing Platform

All AVR® microcontrollers require some software uploading.  To create and debug this software, you can use an integrated development environment (IDE), such as Atmel Studio produced by Microchip.  This IDE contains everything needed to create, compile and debug source-code, and empowers the user to upload source-code directly into the on-chip Flash of an AVR MCU, without any other software components.

Atmel Studio 7 is available free of charge from the Atmel Studio 7 download page on the Microchip company website.  This application requires a computer running an MS-Windows desktop environment.

Assembly (Asm) Instructions Used

Review the AVR datasheet Instruction Set Summary.  This guide will be using the following instructions: cbi, ldi, out, dec, adiw, brne, rcall, ret, rjmp and sbi, to create an Arduino UNO program that flashes an LED on and off continuously.

The Assembly computer-programming language instructions (assembly instructions) used in this guide:

ldi 1

Load Immediate Into; Loads an 8-bit constant, directly to registers 16 to 31.

cbi 1

Clear Bit In I/O Register;  Clears a specified bit in an I/O register.

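sbi 1

Set Bit In I/O Register;  Sets a specified bit in an I/O register.

out 1

Store Register To I/O Location;  Stores data from a register to an I/O register.

dec 1

Decrement;  Subtracts one from the contents of register Rd and places the result in the destination register Rd.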

adiw 2

Add Immediate to Word — Adds an immediate value (0–63) to a register pair and places the result in the register pair.

brne 2

Branch If Not Equal;  Conditional relative branch. Tests the Zero Flag (Z) and branches relatively to PC if Z is cleared.

rcall 1

Relative Call to Subroutine;  Relative call to an address within PC - 2K + 1 and PC + 2K (words).

ret 1

Return from Subroutine;  Returns from subroutine.

rjmp 1

Relative Jump ;  Relative jump to an address.
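
The delay loops later in this guide depend on how adiw, dec and brne interact through the Zero flag.  The following is a small host-side Python model of that behaviour, not AVR code; the function names are illustrative only and are not part of any AVR tool.

```python
def adiw(low, high, k):
    """Model of 'adiw': add k (0..63) to the 16-bit register pair high:low.

    Returns (new_low, new_high, zero_flag). The pair wraps around at 0x10000.
    """
    if not 0 <= k <= 63:
        raise ValueError("adiw immediate must be in 0..63")
    word = (((high << 8) | low) + k) & 0xFFFF   # 16-bit wrap-around
    return word & 0xFF, word >> 8, word == 0


def dec(reg):
    """Model of 'dec': subtract one from an 8-bit register, wrapping below 0."""
    result = (reg - 1) & 0xFF
    return result, result == 0


def brne_taken(zero_flag):
    """'brne' branches only while the Zero flag is cleared."""
    return not zero_flag


if __name__ == "__main__":
    low, high, z = adiw(0xFF, 0xFF, 1)   # 0xFFFF + 1 overflows to 0x0000
    print(low, high, z, brne_taken(z))
```

A loop built from adiw followed by brne therefore keeps running until the register pair wraps around to zero.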

Open Atmel Studio 7

Download and install Atmel Studio 7.

Open the application and get beyond the "start page".

Look to the main menu, and click 'Atmel Studio Solution' > 'Blank Solution' from installed tab.  Give the new project a name.  Example name 'AVR1'.  Next, determine and define the directory for this project; where related files will be stored.

On 'Solution Explorer', right click and choose 'Add' > 'New Project', 'Assembler' > 'AVR Assembler Project', and on 'Device Selection', type 328p and choose 'ATmega328P' and give it a name.

At the beginning of a new project, put a 'Block Comment Header' (BCH) in first; providing detailed reference to the purpose of the assembly source code.  This is wise and a good computer-programming practice.

/*------------- Name -----
 * Function
 * Purpose
 * Parameters
 */

Open a Simulation Project.

Open a Target Project.

Init the source code.

Here is an example assembly source code to create a very basic AVR program:

/*------------- Name -----
 * Function
 * Purpose
 * Parameters
 */
.ORG 0x0000 ; The next instruction has to be written to address 0x0000.
; Infinite loop:
START:      ; This is a label "START".
rjmp START  ; Relative Jump to "START".

When the instruction above is executed, the CPU will "jump" to "START".  The "jump" will be repeated over and over, resulting in an infinite loop.

An Arduino UNO AVR runs at 16 MHz (16 Million clock cycles per second).

An LED turning On and Off at 16MHz cannot be detected by the human eye.

The video display industry claims the average human eye cannot detect a difference in a picture-frame presented for less than 16ms to 13ms (0.016-0.013 seconds).  A frame rate frequency is typically between 60Hz and 80Hz (0.00006 to 0.00008 MHz).

Calculate how many AVR cycles, at a frequency of 16MHz, it takes to keep an LED On for 1/2 second and Off for 1/2 second, creating a 1/2 second blink:

1 cycle completes in 0.0000000625 of a second (1 / 16,000,000).
x cycles make 0.5 second.
Calculate 0.5 second LED interval / 0.0000000625 seconds-per-cycle = 8,000,000 cycles to create a 0.5 second blink.
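
The same arithmetic can be checked on a PC.  This is a minimal Python sketch, assuming the 16 MHz Arduino UNO clock stated above; the names F_CPU_HZ, cycle_time_ns and cycles_for are illustrative.

```python
F_CPU_HZ = 16_000_000   # Arduino UNO clock frequency in Hz


def cycle_time_ns(f_cpu_hz=F_CPU_HZ):
    """Duration of one clock cycle in nanoseconds."""
    return 1e9 / f_cpu_hz


def cycles_for(seconds, f_cpu_hz=F_CPU_HZ):
    """Number of clock cycles that elapse in the given time."""
    return round(seconds * f_cpu_hz)


if __name__ == "__main__":
    print(cycle_time_ns())   # nanoseconds per cycle
    print(cycles_for(0.5))   # cycles in half a second
```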

To achieve these cycles, two loops are implemented; an inner and outer loop.

The inner loop is treated like one BIG instruction needing 262145 clock cycles.  Registers can be used in pairs, enabling working with values 0 to 65535.

The following source-code clears registers 24 and 25, and increments them in a loop until they overflow to zero.

clr r24         ; clr uses 1 cycle.
clr r25         ; clr uses 1 cycle.
DELAY_05:       ; 0.5 second delay.
adiw r24, 1     ; adiw needs 2 cycles and
brne DELAY_05   ; brne needs 2 cycles if the branch is done, otherwise 1.

Every time the registers do not overflow, the loop takes adiw (2 cycles) + brne (2 cycles) = 4 cycles.

This loop is done 0xFFFF (65535) times, before the overflow occurs.  The next time, the loop only needs 3 cycles, because no branch is done.

This adds up to 4 * 65535(loops) + 3(overflow) + 2(clr) = 262145 cycles.  This is still not enough cycles: 8000000 / 262145 ~ 30.51.
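
The count can be confirmed by stepping through the loop in a host-side model.  The Python sketch below uses the cycle costs quoted above (clr 1, adiw 2, brne 2 when taken and 1 when not); inner_loop_cycles is an illustrative name, and this is a model of the loop, not code for the MCU.

```python
def inner_loop_cycles(start=0):
    """Count cycles for: clr r24 / clr r25 / (adiw r24, 1 ; brne) until overflow.

    Returns (total_cycles, loop_iterations).
    """
    cycles = 2                  # two 'clr' instructions, 1 cycle each
    counter = start             # 16-bit value held in r25:r24
    iterations = 0
    while True:
        counter = (counter + 1) & 0xFFFF   # adiw r24, 1
        cycles += 2
        iterations += 1
        if counter != 0:        # Z flag clear: brne is taken
            cycles += 2
        else:                   # overflow to zero: brne falls through
            cycles += 1
            break
    return cycles, iterations


if __name__ == "__main__":
    total, loops = inner_loop_cycles()
    print(total, loops)
    print(8_000_000 / total)    # how many such loops fit into half a second
```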

The "outer" loop will be down-counting from 31 to zero, using R16.

ldi r16, 31      ; load the outer loop counter
OUTER_LOOP:      ; outer loop label
ldi r24, 0       ; clear register 24
ldi r25, 0       ; clear register 25
DELAY_05:        ; the loop label
adiw r24, 1      ; "add immediate to word": r24:r25 are incremented
brne DELAY_05    ; repeat until r24:r25 overflow to zero
dec r16          ; decrement r16
brne OUTER_LOOP  ; repeat until r16 reaches zero

The overall loop needs: (262145 (inner loop) + 1 (dec) + 2 (brne)) * 31 = 262148 * 31 = 8126588 cycles.

The result is 8126588 - 8000000 = 126588 cycles too long.
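
A short Python check of these totals, using the same per-pass costs as the text (inner loop 262145, dec 1, brne 2).  The constant and function names are illustrative.

```python
import math

INNER_LOOP_CYCLES = 262145    # one full inner loop, including the two clears
TARGET_CYCLES = 8_000_000     # half a second at 16 MHz


def outer_loop_cycles(passes, inner_cycles=INNER_LOOP_CYCLES):
    """Total cycles for 'passes' runs of: inner loop + dec (1) + brne (2)."""
    return passes * (inner_cycles + 1 + 2)


# Smallest whole number of passes that reaches the target.
passes_needed = math.ceil(TARGET_CYCLES / INNER_LOOP_CYCLES)
total = outer_loop_cycles(passes_needed)
excess = total - TARGET_CYCLES

if __name__ == "__main__":
    print(passes_needed, total, excess)
```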

Change the initial value of r24 : r25.

The outer loop is executed 31 times and includes the "big-inner-loop-instruction".

Each inner loop must be shortened by 126588 / 31 = 4083 cycles (rounded down).

Every iteration of the inner loop takes 4 cycles (the last one takes 3, but that is not important), so divide 4083 by 4 = 1020.75, or 1021 fewer iterations.

This is the new initialisation value for r24:r25.

Do all the cycle calculations with the new numbers and the result is 7999984 clock cycles, within 16 cycles (1 microsecond) of the 8000000 target.
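
The correction can be reproduced with a few lines of Python.  This sketch follows the counting rules used in the text (4 cycles per inner iteration, 3 for the last one, 2 for loading the pair, 1 + 2 for dec and brne); tuned_delay_cycles is an illustrative name.  By these rules the tuned loop lands 16 cycles, one microsecond, under 8000000.

```python
def tuned_delay_cycles(init, passes=31):
    """Cycles for 'passes' outer passes when r25:r24 starts at 'init'."""
    iterations = 0x10000 - init               # inner iterations until overflow
    inner = 4 * (iterations - 1) + 3 + 2      # taken branches + last pass + 2 loads
    return passes * (inner + 1 + 2)           # plus dec (1) and brne (2)


excess = tuned_delay_cycles(0) - 8_000_000    # cycles too long with init = 0
per_pass = excess // 31                       # cycles to remove per outer pass
new_init = round(per_pass / 4)                # 4 cycles per skipped iteration

if __name__ == "__main__":
    print(excess, per_pass, new_init)
    print(tuned_delay_cycles(new_init))
```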

Put this into a separate routine, and run it from the main LED flashing loop.  The complete assembly source code:

.org 0x0000            ; the next instruction has to be written to address 0x0000
  rjmp START           ; the reset vector: jump to "START"

START:
  ldi r16, low(RAMEND) ; set up the stack pointer, low byte
  out SPL, r16
  ldi r16, high(RAMEND) ; set up the stack pointer, high byte
  out SPH, r16
  ldi r16, 0xFF        ; load register 16 with 0xFF (all bits 1)
  out DDRB, r16        ; write 0xFF to Data Direction Register B: all Port B pins are outputs

LOOP:
  sbi PORTB, 5         ; switch the LED on (PB5 high)
  rcall DELAY_05       ; wait for half a second
  cbi PORTB, 5         ; switch it off (PB5 low)
  rcall DELAY_05       ; wait for half a second
  rjmp LOOP            ; jump to loop

; the subroutine:
DELAY_05:
  ldi r16, 31          ; load r16 with 31

OUTER_LOOP:            ; outer loop label
  ldi r24, low(1021)   ; load registers r24:r25 with 1021, new init value.
  ldi r25, high(1021)

DELAY_LOOP:            ; the loop label
  adiw r24, 1          ; "add immediate to word": r24:r25 are incremented
  brne DELAY_LOOP      ; if no overflow ("branch if not equal"), go back to "DELAY_LOOP"
  dec r16              ; Decrement r16
  brne OUTER_LOOP      ; Loop if 'outer_loop' not finished.
  ret                  ; return from subroutine.
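
For a closer estimate of the real delay, the subroutine can be stepped through instruction by instruction in a host-side model.  The Python sketch below assumes an adiw/brne inner loop nested inside a dec/brne outer loop, entered with rcall and left with ret.  The cycle costs are assumptions taken from the ATmega328P instruction set summary (ldi 1, adiw 2, dec 1, brne 2 taken or 1 not taken, rcall 3, ret 4), and the names are illustrative.

```python
LDI, ADIW, DEC = 1, 2, 1
BRNE_TAKEN, BRNE_NOT_TAKEN = 2, 1
RCALL, RET = 3, 4


def delay_05_cycles(outer=31, init=1021):
    """Cycles spent from 'rcall DELAY_05' until execution resumes after 'ret'."""
    total = RCALL + LDI                     # rcall DELAY_05, ldi r16
    r16 = outer
    while True:
        total += 2 * LDI                    # ldi r24 / ldi r25
        word = init
        while True:
            word = (word + 1) & 0xFFFF      # adiw r24, 1
            total += ADIW
            if word != 0:
                total += BRNE_TAKEN         # back to DELAY_LOOP
            else:
                total += BRNE_NOT_TAKEN     # pair overflowed, fall through
                break
        r16 = (r16 - 1) & 0xFF              # dec r16
        total += DEC
        if r16 != 0:
            total += BRNE_TAKEN             # back to OUTER_LOOP
        else:
            total += BRNE_NOT_TAKEN
            break
    return total + RET


if __name__ == "__main__":
    cycles = delay_05_cycles()
    print(cycles, cycles / 16_000_000)      # cycles and seconds at 16 MHz
```

Under these assumptions the call comes out a handful of cycles short of 8000000, well under one microsecond, and the difference is invisible on a blinking LED.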

Run Simulation

Look to the menu and click 'Build' > 'Build Solution', then click the 'Start Debugging and Break' (Alt + F5) button in the 'Debug' menu.

If the error "Please select a connected tool and interface and try again" appears, click 'Continue', and in 'Tool' > 'Selected debugger / programmer' choose 'Simulator'.

While there, go to 'Toolchain' and under 'General' disable 'Generate HEX file'.  This is a safety measure: this is a simulation and not a program to be put on an MCU chip.

Press Ctrl + S, dismiss the Properties window, and press Alt + F5 again.

The debugger starts, and 'break' means it pauses at the first line (or wherever you wish).  Select 'PORTB' in the 'I/O' window.

Notice a yellow arrow stops right at the address zero.  It points to the instruction to be executed next (pressing 'Reset' brings you here also), and not yet executed.

To step through the source-code line-by-line, there are basically two key commands you must be aware of: 'Step Over' (F10) and 'Step Into' (F11).
Pressing F11 will move to the next instruction.  The cycle counter starts counting as machine cycles pass.
At the Arduino clock frequency, one cycle takes 0.0625 microseconds.
In the next step, notice the R16 register value will change.  It will be '0xFF', the low byte of RAMEND (0x08FF), which is written to the stack pointer.

These operations enable using 16-bits (one word) to do the counting required.
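
The assembler's low() and high() functions split a 16-bit value into the two bytes loaded into r16.  This is a small Python sketch of what they compute, assuming RAMEND is 0x08FF, the last SRAM address of the ATmega328P; the Python functions only mirror the assembler functions of the same name.

```python
RAMEND = 0x08FF   # assumed last SRAM address of the ATmega328P


def low(value):
    """Low byte of a 16-bit value, as written to SPL."""
    return value & 0xFF


def high(value):
    """High byte of a 16-bit value, as written to SPH."""
    return (value >> 8) & 0xFF


if __name__ == "__main__":
    print(hex(low(RAMEND)), hex(high(RAMEND)))
```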

Next, configure DDRB with 0xFF (all bits are 1), setting every Port B pin to OUTPUT mode.

Next, set 'PORTB' up and down alternately using sbi to set bit and cbi to clear it.
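
While stepping, the 'PORTB' value in the I/O window changes by exactly one bit.  This is a minimal Python model of what sbi and cbi do to the 8-bit register, assuming bit 5 (PB5, the on-board LED pin of the Arduino UNO); the Python function names are illustrative.

```python
LED_BIT = 5   # PB5 drives the on-board LED of the Arduino UNO


def sbi(port, bit):
    """Set one bit of an 8-bit I/O register value."""
    return (port | (1 << bit)) & 0xFF


def cbi(port, bit):
    """Clear one bit of an 8-bit I/O register value."""
    return port & ~(1 << bit) & 0xFF


if __name__ == "__main__":
    portb = 0x00
    portb = sbi(portb, LED_BIT)
    print(hex(portb))            # bit 5 set
    portb = cbi(portb, LED_BIT)
    print(hex(portb))            # bit 5 cleared again
```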

Use rcall to call the subroutine delay_05 (half second).  For this press F10 (Step Over).

After a few seconds, the cycle counter will stop at about 8 million cycles, according to the calculations above.

Check the Stop Watch; it shows about half a second.

500,000 µs = 0.5 second
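
The cycle counter and the Stop Watch are tied together by the clock frequency.  This is a small Python sketch of the conversion, assuming the 16 MHz clock; cycles_to_microseconds is an illustrative name.

```python
def cycles_to_microseconds(cycles, f_cpu_hz=16_000_000):
    """Convert a cycle count into elapsed time in microseconds."""
    return cycles * 1_000_000 / f_cpu_hz


if __name__ == "__main__":
    print(cycles_to_microseconds(1))           # one clock cycle
    print(cycles_to_microseconds(8_000_000))   # the half-second delay
```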

Stop debugging by clicking on 'Stop debugging' button (Ctrl+Shift+F5).

Run On Real Arduino UNO

Copy and paste this code to target project.

Go to 'Tools' menu, and click "Send To ArduinoUNO".

Save Project As A Template

Use the source code of the simulation as a template. 

Copy and paste the following template block comment header to the file "main.asm".

/*================================================================
| Project: TITLE HERE
| Author: NAME HERE
| Language: The Computer Programming Language this source code was written in.
| The name of the compiler used to compile when tested.
| Solution: The Solution Name used with ATMEL STUDIO 7
| Projects: _##_####_Simulation_ATMEGA328P &
| _##_####_Target_ATMEGA328P
| To Compile: Explain How To Compile This Source Code.
+ -----------------------------------------------------------------
*================================================================*/
.INCLUDE "m328pdef.inc" ; This is automatically included.
.ORG 0x0000 ; initial instruction.
rjmp START ; the reset vector: jump to "START".
;***place data here***

START:
;***source code here***
LOOP:
;***loop routine***
rjmp LOOP
;***procedure here***
.EXIT ; End of Source Code; assembler process exit here.


Now click 'File' > 'Export Template…'.
Choose 'Template Type', click 'Project Template'.

Click "Next".

Enter the following information:

Choose Template name (first_template_328P).

Template description, enter a brief and clear description (simulation/target project template for Arduino Uno).

Output location (new directory).

Leave everything in the default setting.

A ZIP file will be produced.



Repeat the same procedure to save the project.  Name it as "target_template_328P".

For testing, restart the 'Atmel Studio 7 IDE' application, and move to 'File' > 'New' > 'Project'.

To save a Target project as a template, repeat the same steps above for the target project.

Use The Template

Select 'File' > 'Import' > 'Project Template', and select the new .zip file of the project created above.

Use The Technology Wisely & Keep It Simple

- Cheers!

Gene Casanova


CGI Computer Wares | EST 1979
