Output data format
CTree writes its main outputs in binary format using C++
file streams (std::ofstream with std::ios::binary) mostly in little-endian byte order.
The binary format is designed for efficient I/O and compact storage of large tree datasets.
This page documents the exact binary layout of the output files and provides example routines for reading them.
Overview
Two binary files are written:
- ctree_key.dat: Mapping between (Snapshot, Halo ID) and branch indices
- ctree_tree.dat: Full tree and branch data
Type Specifiers
Each binary file begins with one or more type specifiers, which encode the data type used at compile time.
The type specifier is stored as a 32-bit integer with the following meaning:
| Value | Data type |
|---|---|
| | 32-bit integer |
| | 64-bit integer |
| | 32-bit floating point |
| | 64-bit floating point |
These specifiers allow the reader to determine the exact binary layout without prior knowledge of the build configuration.
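As a minimal sketch of how a reader might use these specifiers, the snippet below reads one 32-bit integer and looks it up in a table of `struct` format characters. The numeric codes in `HYPOTHETICAL_CODES` are placeholders chosen for illustration, not CTree's actual values, and must be replaced with the values from the table above. Little-endian byte order is assumed as well.

```python
import io
import struct

# Placeholder codes for illustration only: substitute the real specifier values.
HYPOTHETICAL_CODES = {
    1: "i",  # 32-bit integer
    2: "q",  # 64-bit integer
    3: "f",  # 32-bit floating point
    4: "d",  # 64-bit floating point
}

def read_type_specifier(stream, codes=HYPOTHETICAL_CODES):
    """Read one 32-bit little-endian specifier and return its struct format character."""
    raw = stream.read(4)
    if len(raw) != 4:
        raise EOFError("stream ended before a full type specifier was read")
    (code,) = struct.unpack("<i", raw)
    if code not in codes:
        raise ValueError(f"unknown type specifier: {code}")
    return codes[code]

# Usage with an in-memory stream standing in for an output file.
demo = io.BytesIO(struct.pack("<i", 2))
fmt = read_type_specifier(demo)
itemsize = struct.calcsize("<" + fmt)  # bytes per element of that type
print(fmt, itemsize)
```

Returning a format character keeps the sketch within the standard library; a NumPy-based reader could map the same codes to dtypes instead.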
Tree Key File Format
ctree_key.dat provides a mapping between catalog entries and internal
branch indices used in the tree structure.
The key array gives a mapping between (ID, Snapshot number) pairs and their indices in the merger tree branches:
branch_index = key[snapnumber + id * keyvalue]
In CTree, the default key value is the maximum snapshot number + 1, so every (ID, Snapshot number) pair maps to a distinct index.
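The indexing rule can be illustrated with a small in-memory example. This is only a sketch: `build_key` and `lookup` are hypothetical helper names, the toy entries are made up, and the use of -1 for "no branch" is an assumption consistent with the negative-value check in the reading examples.

```python
def build_key(entries, max_snap, max_id):
    """Build a flat key array from (obj_id, snap, branch_index) triples."""
    key_value = max_snap + 1                   # default key value described above
    key = [-1] * (key_value * (max_id + 1))    # -1 marks "no branch" (assumption)
    for obj_id, snap, branch_index in entries:
        key[snap + obj_id * key_value] = branch_index
    return key, key_value

def lookup(key, key_value, obj_id, snap):
    """Return the branch index stored for (obj_id, snap)."""
    return key[snap + obj_id * key_value]

# Toy catalog: object 1 is on branch 7 at snapshots 99 and 100,
# object 2 is on branch 8 at snapshot 100.
entries = [(1, 100, 7), (2, 100, 8), (1, 99, 7)]
key, key_value = build_key(entries, max_snap=100, max_id=2)

print(lookup(key, key_value, 1, 100))
print(lookup(key, key_value, 0, 50))   # negative: no branch for this pair
```

Because every snapshot number is smaller than the key value, two different pairs can never share an index.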
The file is written in the following order:
| Type [Size] | Description |
|---|---|
| | Type specifier for branch index |
| | Size of the key array |
| | Key value |
| | N elements in the key array, where N is the size given above |
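The write order above suggests a reader along the following lines. This is a sketch under several assumptions that should be checked against example/rdtree.py: the specifier codes in `HYPOTHETICAL_CODES` are placeholders, the size and key value fields are assumed to be 32-bit integers (`size_fmt` and `keyvalue_fmt` are hypothetical parameters for changing that), and the byte order is assumed to be little-endian.

```python
import io
import struct

# Placeholder specifier codes: replace with the real values.
HYPOTHETICAL_CODES = {1: "i", 2: "q"}

def read_key_file(stream, codes=HYPOTHETICAL_CODES, size_fmt="i", keyvalue_fmt="i"):
    """Read the key file fields in order and return (key_value, key)."""
    def read_one(fmt):
        nbytes = struct.calcsize("<" + fmt)
        raw = stream.read(nbytes)
        if len(raw) != nbytes:
            raise EOFError("unexpected end of key file")
        return struct.unpack("<" + fmt, raw)[0]

    index_fmt = codes[read_one("i")]     # type specifier for branch index
    n = read_one(size_fmt)               # size of the key array
    key_value = read_one(keyvalue_fmt)   # key value
    nbytes = n * struct.calcsize("<" + index_fmt)
    raw = stream.read(nbytes)            # N elements of the key array
    if len(raw) != nbytes:
        raise EOFError("truncated key array")
    key = list(struct.unpack(f"<{n}{index_fmt}", raw))
    return key_value, key

# Round trip on synthetic bytes written in the same order.
payload = struct.pack("<i", 1)                  # specifier (placeholder code)
payload += struct.pack("<i", 4)                 # array size
payload += struct.pack("<i", 2)                 # key value
payload += struct.pack("<4i", -1, 0, -1, 1)     # key array
key_value, key = read_key_file(io.BytesIO(payload))
print(key_value, key)
```

For a real file, a handle from `open("ctree_key.dat", "rb")` could be passed in place of the `BytesIO` object.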
Branch File Format
ctree_tree.dat stores the complete information for every branch.
The file header consists of the following 6 elements:
| Type [Size] | Description |
|---|---|
| | Type specifier for object ID |
| | Type specifier for snapshot number |
| | Type specifier for branch index |
| | Type specifier for merit value |
| | Total number of branches in the branch array |
After the header, the file stores one record per tree, written sequentially in a loop over all trees.
| Type [Size] | Description |
|---|---|
| | Number of points in the main branch (Note: If …) |
| | Number of merged progenitor branches |
| | If this branch is merged into another, the index of the branch it is merged into |
| | Tree status flag (currently carries no information) |
| | The list of IDs of the main branch |
| | The list of snapshot numbers of the main branch |
| | (Note: If …) The list of IDs that merged into this branch |
| | The list of snapshots that merged into this branch |
| | The list of merit scores when merged |
| | Branch indices that merged into this branch |
Reading example
Python
example/rdtree.py generates a pickle file containing the tree and key arrays.
The following script shows how to load the tree arrays and extract a branch of an object.
import pickle
import numpy as np

# "ctree.pkl" is what rdtree.py generates
with open("ctree.pkl", 'rb') as f:
    data = pickle.load(f)

key = data['key']
tree = data['tree']

# Extract the branch of the galaxy with (ID=1 & Snap=100)
id0 = 1
snap0 = 100
keyv = key[snap0 + key[0]*id0]  # key[0] is used as the key value

if keyv < 0:
    print('no corresponding branch')
else:
    branch = tree[keyv]
    print(branch['br_len'])     # Length of the branch
    print(branch['id'])         # List of IDs
    print(branch['snap'])       # List of snapshots
    print(branch['merit'])      # Merit score of the connections
    print(branch['father_ID'])  # If this branch is merged, the father branch index.
                                # The corresponding branch is tree[branch['father_ID']]
    print(branch['n_mergebr'])  # The number of branches that merged into this one
    print(branch['m_id'])       # ID list of the merged branches
    print(branch['m_snap'])     # Snapshot list of the merged branches
    print(branch['m_merit'])    # Merit of mergers
    print(branch['m_bid'])      # Indices of the merged branches,
                                # i.e. tree[i] for each i in branch['m_bid']
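The lookup and the `m_bid` indirection can be wrapped in two small helpers. The sketch below runs on toy in-memory data that only mimics the field names printed above; `find_branch` and `merged_progenitors` are hypothetical names, and the containers produced by rdtree.py may differ (for example NumPy arrays rather than lists).

```python
def find_branch(key, obj_id, snap):
    """Return the branch index for (obj_id, snap), or None if there is none."""
    idx = key[snap + key[0] * obj_id]   # key[0] is used as the key value
    return None if idx < 0 else int(idx)

def merged_progenitors(tree, branch_index):
    """Return the branches whose indices are listed in m_bid of the given branch."""
    return [tree[i] for i in tree[branch_index]['m_bid']]

# Toy data: key value 3 (snapshots 0..2), two branches.
# Object 1 at snapshot 2 sits on branch 0; object 2 at snapshot 1 sits on branch 1,
# which merged into branch 0.
key = [3, -1, -1, -1, -1, 0, -1, 1, -1]
tree = [
    {'id': [1], 'snap': [2], 'father_ID': -1, 'm_bid': [1]},
    {'id': [2], 'snap': [1], 'father_ID': 0, 'm_bid': []},
]

main = find_branch(key, 1, 2)
for prog in merged_progenitors(tree, main):
    print(prog['id'], prog['snap'])
```

Returning `None` for a missing entry keeps the negative sentinel out of the calling code; the value stored in `father_ID` for an unmerged branch is not specified here, so the -1 in the toy data is a placeholder.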
IDL
example/rdtree.pro generates a sav file containing the tree and key arrays.
; "ctree.sav" is what rdtree.pro generates
RESTORE, 'ctree.sav'

; Extract the branch of the galaxy with (ID=1 & Snap=100)
id0 = 1L
snap0 = 100L
keyv = tree_key[snap0 + tree_key[0]*id0]  ; tree_key[0] is used as the key value

IF keyv LT 0L THEN BEGIN
  PRINT, 'no corresponding branch'
ENDIF ELSE BEGIN
  tree0 = tree_data[keyv] ;; tree_data is a pointer array
  tree0 = *tree0
  PRINT, tree0.br_len     ; Length of the branch
  PRINT, tree0.id         ; List of IDs
  PRINT, tree0.snap       ; List of snapshots
  PRINT, tree0.merit      ; Merit score of the connections
  PRINT, tree0.father_ID  ; If this branch is merged, the father branch index.
                          ; The corresponding branch is *tree_data[tree0.father_ID]
  PRINT, tree0.n_mergebr  ; The number of branches that merged into this one
  PRINT, tree0.m_id       ; ID list of the merged branches
  PRINT, tree0.m_snap     ; Snapshot list of the merged branches
  PRINT, tree0.m_merit    ; Merit of mergers
  PRINT, tree0.m_bid      ; Indices of the merged branches,
                          ; i.e. *tree_data[tree0.m_bid[i]] for each i
ENDELSE