Category "assembly"

What do the `uxtx` and `sxtx` extensions mean for the 32-bit AArch64 `adds` instruction?

I'm looking at the following disassembled AArch64 instruction: 65 6E 20 2B adds w5, w19, w0, uxtx #3 According to the ARM manual, uxtx zero-extends w0 to an

How to create a text file that uses user input from Irvine library to write onto it

How do I create a text file that is updated with values entered by the user, using Irvine's library? For example, for my data I have: .data frstValue BYTE "Enter f

How to remove "noise" from GCC/clang assembly output?

I want to inspect the assembly output of applying boost::variant in my code in order to see which intermediate calls are optimized away. When I compile the fol

Integrating x86 assembly in C++ project for old MS-DOS system information program

I'm new to C++ programming and have always wanted to write a system information program for MS-DOS. I'm currently using the latest Digital Mars C++ compiler and MA

Are RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, R8-R15 interchangeable?

Are x64 registers interchangeable, in the sense that any instruction that works with one combination of them will work with any other? Is there performance diffe

How do you understand 'REX.W + B8+ rd io' form for x86-64 assembly?

I was originally trying to generate the bytes for an immediate move into a 64-bit register. The specific operation I wanted was mov rdi, 0x1337 Using https://ww

Null bytes in shellcode? Why does mov eax,1 machine code have bytes that are 00?

Going through the shellcode article on wikipedia, it gives an example as follows: B8 01000000 MOV EAX,1 // Set the register EAX to 0x000000001 To

relocation truncated to fit R_386_8 against `.bss`

When I try to build my source into a 32-bit static executable for Linux with nasm -f elf -F dwarf -g loop.asm ld -m elf_i386 -o loop loop.o I get this R_386_

Can modern x86 implementations store-forward from more than one prior store?

In the case that a load overlaps two earlier stores (and the load is not fully contained in the oldest store), can modern Intel or AMD x86 implementations forwa

Why is a conditional move not vulnerable to Branch Prediction Failure?

After reading this post (answer on StackOverflow) (at the optimization section), I was wondering why conditional moves are not vulnerable to Branch Prediction

Why is XCHG reg, reg a 3 micro-op instruction on modern Intel architectures?

I'm doing micro-optimization on a performance critical part of my code and came across the sequence of instructions (in AT&T syntax): add %rax, %rbx mov %r

Generating 1 random number within 0-256 range in x86 8086 TASM (16-bit) [duplicate]

At this point I have been learning assembly for about 6 months. My current project is a random number generator. I need to generate 1 random n

Converting ASCII hex number to 32-bit binary integer in x86

So I'm reading the user's 8-digit input and saving it into a variable. For example: Enter an 8-digit hex number: 1ABC5678 So then I loop through the 1ABC5678 h

What is the difference, if any, between LONG and FAR jumps in Assembly?

I'm looking at some practice code for assembly, and the assignment is basically to replace one jump point with another. The original jmp is a SHORT jmp, and th

Can x86's MOV really be "free"? Why can't I reproduce this at all?

I keep seeing people claim that the MOV instruction can be free in x86, because of register renaming. For the life of me, I can't verify this in a single tes

Printing hex from dx with nasm

I want to print the content of the dx register with nasm. The content is a 16-bit hex number such as 0x12AB. Therefore I've first implemented a

Adding 2D arrays in Assembly (x86)

I have to add two 3*3 arrays of words and store the result in another array. Here is my code: .data a1 WORD 1,2,3 WORD 4,2,3 WORD 1,4,3 a2 WORD 4, 3, 8

Understanding bubble vs stall vs repeated decode/fetch

I'm really confused about the difference between bubbles, stalls, and repeated decoding/fetching. My text is the Patterson text, 3rd edition. Example 1: add $3,

Micro fusion and addressing modes

I have found something unexpected (to me) using the Intel® Architecture Code Analyzer (IACA). The following instruction using [base+index] addressing add