Atomic operation overview by architecture

This directory contains architecture-specific atomic implementations.

This document describes the operations that are considered atomic by architecture.

AArch64

target_arch: aarch64, arm64ec
Implementation: aarch64.rs

TODO

Arm

target_arch: arm
Implementation: arm.rs, armv8.rs, arm_linux.rs

TODO

AVR

target_arch: avr
Implementation: avr.rs
Refs: AVR® Instruction Set Manual, Rev. DS40002198B

This architecture is always single-core and the following operations are atomic:

  • Operations that complete within a single instruction.
    This is because the currently executing instruction must be completed before entering the interrupt service routine.
    (Refs: AVR® Interrupts)
    The following two kinds of instructions are related to memory access:

    • 8-bit load/store
    • XCH, LAC, LAS, LAT: 8-bit swap,fetch-and-{clear,or,xor} (xmegau family)
  • Operations performed in a situation where all interrupts are disabled.
    However, compiler optimizations may still move pure operations that are not affected by compiler fences out of the critical section. (Note: a correct implementation of interrupt disabling and restoring must imply compiler fences, e.g., by using asm without nomem/readonly.)
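
The critical-section approach above can be sketched in portable Rust as follows. This is a minimal illustration, not the implementation in avr.rs: `disable_interrupts`, `restore_interrupts`, `with_interrupts_disabled`, and `fetch_add_u16` are hypothetical names, and the two interrupt helpers are stubs that contain only a compiler fence so that the example builds on any host. On real hardware they would be inline asm without nomem/readonly, which implies the compiler fence.

```rust
use std::cell::Cell;
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical stub: on a real target this would save the interrupt state
/// and disable interrupts with inline asm (without `nomem`/`readonly`).
fn disable_interrupts() -> bool {
    // Keep memory accesses from being reordered across this point by the compiler.
    compiler_fence(Ordering::SeqCst);
    true // pretend interrupts were enabled before the call
}

/// Hypothetical stub: on a real target this would restore the saved state.
fn restore_interrupts(_was_enabled: bool) {
    compiler_fence(Ordering::SeqCst);
}

/// Runs `f` between the (stubbed) disable and restore calls.
fn with_interrupts_disabled<R>(f: impl FnOnce() -> R) -> R {
    let state = disable_interrupts();
    let result = f();
    restore_interrupts(state);
    result
}

/// A 16-bit read-modify-write, which needs more than one instruction on an
/// 8-bit core and therefore relies on the critical section for atomicity.
fn fetch_add_u16(counter: &Cell<u16>, val: u16) -> u16 {
    with_interrupts_disabled(|| {
        let prev = counter.get();
        counter.set(prev.wrapping_add(val));
        prev
    })
}
```

The fences sit inside the helpers on purpose: any memory access in the closure is then bracketed by them, while a pure computation with no memory access is not constrained and can still be hoisted out.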

Hexagon

target_arch: hexagon
Implementation: hexagon.rs

TODO

LoongArch

target_arch: loongarch64
Implementation: loongarch.rs

TODO

M68k

target_arch: m68k
Implementation: m68k.rs
Refs: M68000 FAMILY Programmer's Reference Manual

The following instructions are atomic if the address is properly aligned and the specified storage meets the requirements:

  • Load/Store Instructions

    • {8,16,32}-bit
  • Multiprocessor Instructions

    • TAS: 8-bit TAS (M68000 or later)
    • CAS: {8,16,32}-bit CAS (M68020 or later)
    • CAS2: {16,32}-bit double CAS (M68020 or later)

    (Refs: Section 3.1.11 "Multiprocessor Instructions" of M68000 FAMILY Programmer's Reference Manual)

Note that CAS2 is not yet supported in LLVM (as of 19).

MIPS

target_arch: mips, mips32r6, mips64, mips64r6
Implementation: mips.rs
Refs: The MIPS32® Instruction Set Manual, Revision 6.06 (MD00086), The MIPS64® Instruction Set Reference Manual, Revision 6.06 (MD00087), MIPS® Coherence Protocol Specification, Revision 01.01 (MD00605)

The following instructions are atomic if the address is properly aligned and the specified storage meets the requirements:

  • Load/Store Instructions

    • {8,16,32}-bit and 64-bit (MIPS64 only)
  • LoadLinked and StoreConditional Instructions (LL/SC)

    • LL/SC: 32-bit LL/SC (MIPS32 or later)
    • LLD/SCD: 64-bit LL/SC (MIPS64 or later)
    • LLE/SCE: 32-bit LL/SC (MIPS32 or later, only present if Config5EVA=1)
    • LLWP/SCWP: 64-bit LL/SC (MIPS32R6 or later, only present if Config5XNP=0)
    • LLWPE/SCWPE: 64-bit LL/SC (MIPS32R6 or later, only present if Config5XNP=0 and Config5EVA=1)
    • LLDP/SCDP: 128-bit LL/SC (MIPS64R6 or later, only present if Config5XNP=0)

Note that LL{W,D}P{,E}/SC{W,D}P{,E} are not yet supported in LLVM (as of 19).

None of the above instructions imply a memory barrier. Several types of memory barriers are provided by the SYNC instruction, but only SYNC (SYNC 0) is mandatory.
(Refs: Section 4.6.2 "Memory Barriers" and section 4.6.3 "Implicit Memory Barriers" of MIPS® Coherence Protocol Specification)
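
As an illustration of how an LL/SC pair is typically used, the following portable Rust sketch builds a read-modify-write operation from a retry loop. The function name `fetch_max_relaxed` is hypothetical. It uses the standard `compare_exchange_weak`, which is allowed to fail spuriously and so can map onto a single load-linked/store-conditional attempt on LL/SC targets. The actual instruction sequence is chosen by the compiler backend and is not asserted here.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Atomically stores `max(current, val)` and returns the previous value.
/// `Relaxed` is used throughout, matching the fact that the LL/SC
/// instructions themselves do not imply a memory barrier.
fn fetch_max_relaxed(a: &AtomicU32, val: u32) -> u32 {
    let mut prev = a.load(Ordering::Relaxed);
    loop {
        let next = prev.max(val);
        // Retry when another writer intervened or the attempt failed spuriously.
        match a.compare_exchange_weak(prev, next, Ordering::Relaxed, Ordering::Relaxed) {
            Ok(_) => return prev,
            Err(actual) => prev = actual,
        }
    }
}
```

If ordering is required, it would have to be requested through a stronger `Ordering` argument or a separate fence, since nothing in the loop itself provides it.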

MSP430

target_arch: msp430
Implementation: msp430.rs
Refs: MSP430x5xx and MSP430x6xx Family User's Guide, Rev. Q

This architecture is always single-core and the following operations are atomic:

  • Operations that complete within a single instruction.
    This is because the currently executing instruction must be completed before entering the interrupt service routine.
    (Refs: Section 1.3.4.1 "Interrupt Acceptance" of MSP430x5xx and MSP430x6xx Family User's Guide, Rev. Q)

  • Operations performed in a situation where all interrupts are disabled.
    However, compiler optimizations may still move pure operations that are not affected by compiler fences out of the critical section. (Note: a correct implementation of interrupt disabling and restoring must imply compiler fences, e.g., by using asm without nomem/readonly.)

PowerPC

target_arch: powerpc, powerpc64
Implementation: powerpc.rs
Refs: Power ISA (3.1C, 2.07B)

The following instructions are atomic if the address is properly aligned and the specified storage meets the requirements:

  • Load/Store Instructions

    • All {8,16,32}-bit and 64-bit (powerpc64-only) single load/store instructions other than Move Assist instructions
    • lq/stq: 128-bit load/store (powerpc64-only)
      Compatibility: ISA 2.07 or later (available since ISA 2.03, but were privileged instructions and big-endian mode only and no documented atomicity guarantee, in pre-2.07 ISA)
      • ISA 2.07B: included in the requirements of server processors as Load/Store Quadword category
      • ISA 3.1C: included in the Linux Compliancy subset and AIX Compliancy subset
    • plq/pstq: 128-bit load/store (powerpc64-only)
      (Note: Not mentioned in "Single-Copy Atomicity" section, but GCC uses them for 128-bit load/store)
      Compatibility: ISA 3.1 or later
      • ISA 3.1C: included in the Linux Compliancy subset and AIX Compliancy subset

    (Refs: Section 1.4 "Single-Copy Atomicity" of Power ISA 3.1C Book II)

  • Load And Reserve and Store Conditional Instructions (aka LL/SC)

    • l{b,h}arx/st{b,h}cx.: {8,16}-bit LL/SC
      Compatibility: ISA 2.06 or later
      • ISA 2.07B: included in the requirements as Base category
      • ISA 3.1C: included in all compliancy subsets
    • lwarx/stwcx.: 32-bit LL/SC
      Compatibility: PPC or later
      • ISA 2.07B: included in the requirements as Base category
      • ISA 3.1C: included in all compliancy subsets
    • ldarx/stdcx.: 64-bit LL/SC (powerpc64-only)
      Compatibility: PPC or later
      • ISA 2.07B: included in the requirements of 64-bit processors as 64-bit category
      • ISA 3.1C: included in the Linux Compliancy subset and AIX Compliancy subset
    • lqarx/stqcx.: 128-bit LL/SC (powerpc64-only)
      Compatibility: ISA 2.07 or later
      • ISA 2.07B: included in the requirements of server processors as Load/Store Quadword category
      • ISA 3.1C: included in the Linux Compliancy subset and AIX Compliancy subset

    (Refs: Section 4.6.2 "Load And Reserve and Store Conditional Instructions" of Power ISA 3.1C Book II)

  • Atomic Memory Operation (AMO) Instructions

    • l{w,d}at: {32,64}-bit swap,fetch-and-{add,and,or,xor,max,min},etc. (powerpc64-only)
    • st{w,d}at: {32,64}-bit add,and,or,xor,max,min,etc. (powerpc64-only)

    Compatibility: ISA 3.0 or later

    • ISA 3.1C: included in the AIX Compliancy subset

    (Refs: Section 4.5 "Atomic Memory Operations" of Power ISA 3.1C Book II)

Load-store instructions are atomic only if properly aligned. LL/SC and AMO instructions require proper alignment, otherwise the system alignment error handler is invoked or the results are boundedly undefined.
(Refs: Section 1.4 "Single-Copy Atomicity", 4.6.2 "Load And Reserve and Store Conditional Instructions", and 4.5 "Atomic Memory Operations" of Power ISA 3.1C Book II)

Note that plq/pstq are not yet supported in LLVM (as of 19).

None of the above instructions imply a memory barrier.

  • A sync (sync 0, sync 0,0, hwsync) instruction can be used as both an “import barrier” and an “export barrier”.
    Compatibility: POWER1 or later (some BookE processors don't have this and provide msync instead)
    • ISA 2.07B: included in the requirements as Base category
    • ISA 3.1C: included in all compliancy subsets
  • A lwsync (sync 1, sync 1,0) instruction can be used as both an “import barrier” and an “export barrier”, if the specified storage location is in storage that is neither Write Through Required nor Caching Inhibited.
  • An “import barrier” can be constructed by a branch that depends on the loaded value (even a branch that depends on a comparison of the same register is okay), followed by an isync instruction.
    Compatibility: POWER1 or later
    • ISA 2.07B: included in the requirements as Base category
    • ISA 3.1C: included in all compliancy subsets

(Refs: Section 1.7.1 "Storage Access Ordering" and Section B.2 "Lock Acquisition and Release, and Related Techniques" of Power ISA 3.1C Book II)

sync corresponds to SeqCst semantics, lwsync corresponds to Acquire/Release semantics, and isync with appropriate sequence corresponds to Acquire semantics.
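
The Acquire/Release case can be illustrated with a message-passing pattern in portable Rust. This is a sketch with a hypothetical function name (`message_passing`). On PowerPC a compiler would be expected to implement the Release store and Acquire load with barriers such as lwsync or a branch-plus-isync sequence, but the exact choice is a backend decision and is not asserted here.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Publishes a value from one thread and reads it from another.
fn message_passing() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let producer = {
        let (data, ready) = (Arc::clone(&data), Arc::clone(&ready));
        thread::spawn(move || {
            data.store(42, Ordering::Relaxed);
            // Release: the store to `data` is ordered before this store
            // ("export barrier" side).
            ready.store(true, Ordering::Release);
        })
    };

    // Acquire: later loads are ordered after the load that observes `true`
    // ("import barrier" side).
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    let seen = data.load(Ordering::Relaxed);

    producer.join().unwrap();
    seen
}
```

Because the Acquire load that observes `true` synchronizes with the Release store, the subsequent Relaxed load of `data` is guaranteed to see 42.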

RISC-V

target_arch: riscv32, riscv64
Implementation: riscv.rs
Refs: RISC-V Instruction Set Manual

The following instructions are atomic if the address is properly aligned and the specified storage meets the requirements:

Of the above instructions, all except relaxed load/store can specify the memory ordering.
The mappings from the C/C++ atomic operations are described in the RISC-V Atomics ABI Specification.

Note: The "A" extension comprises the instructions provided by the Zalrsc and Zaamo extensions. The Zabha and Zacas extensions depend on the Zaamo extension.

s390x

target_arch: s390x
Implementation: s390x.rs
Refs: z/Architecture Principles of Operation (Fourteenth Edition)

The following instructions are atomic if the address is properly aligned and the specified storage meets the requirements:

  • Load/Store Instructions

    • All {8,16,32,64}-bit load/store instructions that have Single-Access References
      (Refs: Section "Storage-Operand Fetch References", "Storage-Operand Store References", and "Storage-Operand Consistency" of z/Architecture Principles of Operation, Fourteenth Edition)
    • LPQ/STPQ: 128-bit load/store (arch1 or later)
      (Refs: Section "LOAD PAIR FROM QUADWORD" and "STORE PAIR TO QUADWORD" of z/Architecture Principles of Operation, Fourteenth Edition)
  • Instructions that have Interlocked-Update References

    • TS: 8-bit TAS (360 or later)
    • CS{,Y,G}, CDS{,Y,G}: {32,64,128}-bit CAS (CS,CDS: 370 or later, CSG,CDSG: arch1 or later, CSY,CDSY: long-displacement facility added in arch3)
    • LAA{,G}, LAAL{,G}, LAN{,G}, LAO{,G}, LAX{,G}: {32,64}-bit fetch-and-{add,and,or,xor} (interlocked-access facility 1 added in arch9)
    • A{,G}SI, AL{,G}SI: {32,64}-bit add with immediate (interlocked-access facility 1 added in arch9)
    • NI{,Y}, OI{,Y}, XI{,Y}: 8-bit {and,or,xor} with immediate (interlocked-access facility 2 added in arch10)

    (Refs: Section "Storage-Operand Update References" of z/Architecture Principles of Operation, Fourteenth Edition)

Of the above instructions, those that have Interlocked-Update References, other than STORE CHARACTERS UNDER MASK, perform serialization.
(Refs: Section "CPU Serialization" of z/Architecture Principles of Operation, Fourteenth Edition)

The following instructions are usually used as standalone serialization:

  • BCR 15,0 (360 or later)
  • BCR 14,0 (fast-BCR-serialization facility added in arch9)

(Refs: Section "BRANCH ON CONDITION" of z/Architecture Principles of Operation, Fourteenth Edition)

Serialization corresponds to SeqCst semantics, and all memory accesses have Acquire/Release semantics.
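
The case where Acquire/Release is not enough and SeqCst is needed can be illustrated with the classic store-buffering test, sketched below in portable Rust with a hypothetical function name (`store_buffering_once`). Each thread stores to one location and then loads the other. With SeqCst on every access, at least one thread must observe the other's store. On s390x this is the situation in which a serialization would be expected between the store and the following load, although the instructions actually emitted are up to the compiler.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Runs one round of the store-buffering test and returns what each thread read.
fn store_buffering_once() -> (u32, u32) {
    let x = AtomicU32::new(0);
    let y = AtomicU32::new(0);
    thread::scope(|s| {
        let t1 = s.spawn(|| {
            x.store(1, Ordering::SeqCst);
            y.load(Ordering::SeqCst)
        });
        let t2 = s.spawn(|| {
            y.store(1, Ordering::SeqCst);
            x.load(Ordering::SeqCst)
        });
        (t1.join().unwrap(), t2.join().unwrap())
    })
}
```

With Release stores and Acquire loads instead, the outcome in which both threads read 0 would be permitted by the memory model, which is why this pattern needs the stronger ordering.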

SPARC

target_arch: sparc, sparc64
Implementation: sparc.rs
Refs: The SPARC Architecture Manual (Version 9, Version 8)

The following instructions are atomic if the address is properly aligned and the specified storage meets the requirements:

  • Load/Store Instructions

    • V7 or later: {8,16,32}-bit
    • V8+,V9: 64-bit

    (Refs: Section D.4.1 "Value Atomicity" of the SPARC Architecture Manual, Version 9)

  • Compare-and-Swap Instructions

    • V8+,V9: {32,64}-bit CAS
    • V8 with LEONCASA: 32-bit CAS

    (Refs: Section 8.4.6 "Hardware Primitives for Mutual Exclusion" of the SPARC Architecture Manual, Version 9)

  • SWAP Instructions (deprecated in V9)

    • V7 or later: 32-bit swap

    (Refs: Section 8.4.6 "Hardware Primitives for Mutual Exclusion" and A.57 "Swap Register with Memory" of the SPARC Architecture Manual, Version 9)

  • Load Store Unsigned Byte Instructions

    • V7 or later: 8-bit TAS

    (Refs: Section 8.4.6 "Hardware Primitives for Mutual Exclusion" of the SPARC Architecture Manual, Version 9)
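
As an illustration of what an 8-bit TAS primitive is used for, the following portable Rust sketch builds a simple spinlock from an atomic swap on a boolean. The type `TasLock` and its methods are hypothetical names for this example. Whether the swap is lowered to a load-store-unsigned-byte instruction on a given SPARC target depends on the compiler and is not asserted here.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// A minimal test-and-set spinlock.
struct TasLock {
    locked: AtomicBool,
}

impl TasLock {
    const fn new() -> Self {
        Self { locked: AtomicBool::new(false) }
    }

    /// Test-and-set: writes `true` and reports whether the lock was free.
    fn try_lock(&self) -> bool {
        !self.locked.swap(true, Ordering::Acquire)
    }

    /// Spins until the test-and-set succeeds.
    fn lock(&self) {
        while !self.try_lock() {
            std::hint::spin_loop();
        }
    }

    /// Releases the lock with a plain atomic store.
    fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}
```

The Acquire and Release orderings are requested explicitly because, as with the other architectures in this document, the barrier implied by the underlying instruction depends on the target and its memory model.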

Memory access instructions require proper alignment, but some instructions are implementation-dependent and may work with insufficient alignment.
(Refs: Section 6.3.1.1 "Memory Alignment Restrictions" of the SPARC Architecture Manual, Version 9)

Which memory barrier the above instructions imply depends on the memory model used. V8+ and V9 have three memory models: Total Store Order (TSO), Partial Store Order (PSO), and Relaxed Memory Order (RMO). V8 has only TSO and PSO. Implementation of TSO (or a more strongly ordered model which implies TSO) is mandatory, and PSO and RMO are optional.
(Refs: Section 8.4.4 "Memory Models" of the SPARC Architecture Manual, Version 9)

x86

target_arch: x86, x86_64
Implementation: x86.rs

TODO

Xtensa

target_arch: xtensa
Implementation: xtensa.rs

TODO