Architecture

System Design

xccmeta is a libclang wrapper optimized for metadata extraction and code generation pipelines. It trades generality for usability in the reflection/codegen domain.

Core Principles

1. Parse once, query many times

2. Tag-driven workflows

3. Libclang isolation

4. Build pipeline integration

Architecture Layers

┌─────────────────────────────────────────┐
│  User Code (code generators, tools)    │
├─────────────────────────────────────────┤
│  High-Level API                         │
│  - filter (deduplication, bulk ops)     │
│  - generator (formatted output)         │
│  - importer (file I/O + globs)          │
├─────────────────────────────────────────┤
│  Core AST API                           │
│  - node (tree structure, queries)       │
│  - type_info (type introspection)       │
│  - tags (metadata extraction)           │
│  - source (location tracking)           │
├─────────────────────────────────────────┤
│  Parser Layer                           │
│  - parser (libclang bridge)             │
│  - compile_args (flag management)       │
│  - [preprocessor (optional text dump)]  │
├─────────────────────────────────────────┤
│  libclang (LLVM 18.x)                   │
└─────────────────────────────────────────┘

Data Flow

Typical generator pipeline:

1. INPUT: C++ source files
   ↓
2. importer::get_files() → vector<file>
   ↓
3. for each file:
     file.read() → string
     ↓
4. parser.parse(source, args) → node_ptr (AST root)
   ↓
5. node::find_descendants(predicate) → vector<node_ptr>
   ↓
6. filter.add(nodes) + filter.clean() → deduplicated types
   ↓
7. for each type:
     Extract: type_info, tags, fields, methods
     ↓
8. generator.out(generated_code) → writes to file
   ↓
9. OUTPUT: Generated .hpp file
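
Step 5 of the pipeline, a predicate-driven descendant search, can be sketched with a stand-in tree. Node, NodePtr, make_node and the free function find_descendants below are hypothetical illustrations, not xccmeta's real classes or signatures; the sketch only shows the depth-first collection that such a query typically performs.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an AST node; not xccmeta's real node class.
struct Node;
using NodePtr = std::shared_ptr<Node>;

struct Node {
    std::string kind;               // e.g. "class", "field", "namespace"
    std::string name;
    std::vector<NodePtr> children;
};

// Convenience constructor for building small example trees.
NodePtr make_node(std::string kind, std::string name,
                  std::vector<NodePtr> children = {}) {
    auto node = std::make_shared<Node>();
    node->kind = std::move(kind);
    node->name = std::move(name);
    node->children = std::move(children);
    return node;
}

// Depth-first, pre-order walk that appends every descendant of root
// (root itself excluded) for which pred returns true.
void find_descendants(const NodePtr& root,
                      const std::function<bool(const Node&)>& pred,
                      std::vector<NodePtr>& out) {
    assert(root != nullptr);
    for (const NodePtr& child : root->children) {
        if (pred(*child)) out.push_back(child);
        find_descendants(child, pred, out);
    }
}
```

Because the parsed tree is immutable in this sketch, the same root can be searched repeatedly with different predicates, which is the "parse once, query many times" idea.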

Memory Model

Ownership:

Lifetime:

Why shared_ptr:

Cost:

Preprocessing Model

Two-phase approach:

  1. Parser’s internal preprocessing (always happens)
    • libclang evaluates #define, #ifdef, #include
    • AST reflects post-preprocessor view
    • User doesn’t see expanded text
  2. Optional text preprocessing (preprocessor module)
    • Exposes macro-expanded source as string
    • Rarely needed (debugging, #line directives)
    • Separate from parsing (doesn’t affect AST)

Design rationale: Most users want AST metadata, not preprocessed text. Parser handles preprocessing transparently.
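
A small input illustrates what "post-preprocessor view" means in practice. The Player struct and the ENABLE_DEBUG_FIELDS macro are made-up names for this example; in a real setup the macro would more likely come from a -D compile flag than from the file itself.

```cpp
// Hypothetical input header. ENABLE_DEBUG_FIELDS is an invented macro.
#define ENABLE_DEBUG_FIELDS 1

struct Player {
    int health;
#if ENABLE_DEBUG_FIELDS
    // This field exists in the AST only while the condition is true.
    // With the macro set to 0, a libclang-based parser never sees it,
    // so generated code would not mention debug_id at all.
    int debug_id;
#endif
};
```

Generated output therefore depends on the compile arguments used for parsing, so the generator should receive the same defines as the real build.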

Extensibility Points

Adding new node kinds:

  1. Update node::kind enum (xccmeta_node.hpp:73)
  2. Map libclang cursor kind in parser_impl (internal)
  3. Add convenience methods to node class if needed

Adding type properties:

  1. Add member to type_info class (xccmeta_type_info.hpp:34)
  2. Update parser_impl to populate during AST construction
  3. Add getter method

Custom tag parsing:

Performance Characteristics

Parsing: O(n) in source size, roughly 1 MB/s (varies by include depth)

Traversal: O(n) in node count

Type queries: O(1)

Filter deduplication: O(n²) in the worst case
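
To show where a quadratic worst case can come from, here is a minimal sketch of two deduplication strategies over plain strings. Neither function is xccmeta's filter implementation; the names are hypothetical and the strings stand in for whatever identity a real filter compares (for example a qualified type name).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Pairwise approach: each candidate is compared against every entry kept
// so far, which is O(n^2) comparisons when all names are distinct.
std::vector<std::string> dedupe_pairwise(const std::vector<std::string>& names) {
    std::vector<std::string> unique;
    for (const std::string& name : names) {
        if (std::find(unique.begin(), unique.end(), name) == unique.end()) {
            unique.push_back(name);
        }
    }
    assert(unique.size() <= names.size());
    return unique;
}

// Hash-based approach: expected O(n), at the cost of an extra set.
// Both functions keep the first occurrence and preserve input order.
std::vector<std::string> dedupe_hashed(const std::vector<std::string>& names) {
    std::unordered_set<std::string> seen;
    std::vector<std::string> unique;
    for (const std::string& name : names) {
        if (seen.insert(name).second) {
            unique.push_back(name);
        }
    }
    assert(unique.size() <= names.size());
    return unique;
}
```

For the few hundred reflected types of a typical project the pairwise cost is unlikely to matter; the hashed variant is only worth considering for very large inputs.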

Error Handling Strategy

Parse failures: Return nullptr

File I/O: Exceptions

Invalid queries: Undefined behavior

Design rationale: Parsing is the only operation that commonly fails. File I/O errors are exceptional. Node queries assume correct usage (checked by user via get_kind()).
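
On the caller side, the three policies above translate into three different checks. The sketch below uses stand-ins only: StubNode, parse_stub, read_text_file and process are hypothetical, and the stub parser treats empty input as a failure purely for illustration.

```cpp
#include <cassert>
#include <fstream>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; xccmeta's real types and signatures may differ.
struct StubNode { std::string kind; };
using StubNodePtr = std::shared_ptr<StubNode>;

// File I/O reports failure by throwing.
std::string read_text_file(const std::string& path) {
    std::ifstream in(path);
    if (!in) throw std::runtime_error("cannot open " + path);
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}

// Parsing reports failure by returning nullptr.
StubNodePtr parse_stub(const std::string& source) {
    if (source.empty()) return nullptr;   // illustrative failure rule
    return std::make_shared<StubNode>(StubNode{"translation_unit"});
}

enum class Status { ok, io_error, parse_error };

Status process(const std::string& path) {
    std::string source;
    try {
        source = read_text_file(path);    // exceptional path: catch
    } catch (const std::runtime_error&) {
        return Status::io_error;
    }
    StubNodePtr root = parse_stub(source);
    if (!root) return Status::parse_error; // common path: null check
    // Queries assume correct usage, so verify the kind before relying on it.
    assert(root->kind == "translation_unit");
    return Status::ok;
}
```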

Build System Integration

CMake pattern:

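# Regenerate generated.hpp whenever the input header or the generator tool changes.
add_custom_command(
  OUTPUT generated.hpp
  COMMAND my_generator ${CMAKE_SOURCE_DIR}/input.hpp
  DEPENDS ${CMAKE_SOURCE_DIR}/input.hpp my_generator
  VERBATIM
)
# Listing the generated header as a source makes mylib depend on the custom command.
add_library(mylib generated.hpp ...)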

Generator tool:

Workflow: Edit source → CMake re-runs generator → Compiler sees updated output

Thread Safety

Not thread-safe:

Thread-safe after parsing:

Recommendation: Parse in parallel (separate parser per thread), merge results in main thread.
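
The recommendation can be sketched with std::async. StubParser is a hypothetical stand-in that merely collects the word after each "struct" token; the point is the shape of the code: one parser instance per task, nothing shared while parsing, and a single-threaded merge.

```cpp
#include <cassert>
#include <future>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a parser; not xccmeta's real parser class.
struct StubParser {
    // Returns the word following each "struct" token in the source.
    std::vector<std::string> parse(const std::string& source) const {
        std::vector<std::string> names;
        std::istringstream in(source);
        std::string word;
        while (in >> word) {
            if (word == "struct" && (in >> word)) names.push_back(word);
        }
        return names;
    }
};

std::vector<std::string> parse_all(const std::vector<std::string>& sources) {
    std::vector<std::future<std::vector<std::string>>> jobs;
    for (const std::string& src : sources) {
        jobs.push_back(std::async(std::launch::async, [src] {
            StubParser parser;            // one parser per task, never shared
            return parser.parse(src);
        }));
    }
    assert(jobs.size() == sources.size());

    // Merge on the calling thread, in submission order, so the result
    // does not depend on which task finishes first.
    std::vector<std::string> merged;
    for (auto& job : jobs) {
        std::vector<std::string> names = job.get();
        merged.insert(merged.end(), names.begin(), names.end());
    }
    return merged;
}
```

Collecting results in submission order keeps the generated output deterministic across runs, which matters when the output file is checked by a build system.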

Versioning Strategy

API stability: Unstable (pre-1.0)

ABI stability: None

Compatibility: libclang 18.x

Design Trade-offs

Simplicity over generality:

Usability over performance:

Reflection-focused:

Result: Fast development of reflection-based tools. Not suitable for compiler-quality analysis or refactoring tools.