Note
Click here to download the full example code
AlignedDynamicTable¶
This is a user guide to interacting with AlignedDynamicTable objects.
Introduction¶
The class AlignedDynamicTable represents a column-based table
with support for grouping columns by category. AlignedDynamicTable
inherits from DynamicTable and may contain additional
DynamicTable objects, one per sub-category. All tables
must align, i.e., they are required to have the same number of rows. Some key features
of AlignedDynamicTable are:
support custom categories, each of which is a
DynamicTablestored as part of theAlignedDynamicTable,support interaction with category tables individually as well as treating the
AlignedDynamicTableas a single large table, andbecause
AlignedDynamicTableis itself aDynamicTableusers can:Use
DynamicTableRegionto reference rows inAlignedDynamicTableAdd custom columns to the
AlignedDynamicTable, andInteract with
AlignedDynamicTableas well as the category (sub-tables) it contains in the same fashion as withDynamicTable
When to use (and not use) AlignedDynamicTable?¶
AlignedDynamicTable is a useful data structure but it is also
fairly complex, consisting of multiple DynamicTable objects, each of
which is itself a complex type composed of many datasets and attributes. In general, if a simpler
data structure is sufficient, then consider using those instead. For example, consider using instead:
DynamicTableif a regular table is sufficient.A compound dataset via
Tableif all columns of a table are fixed and fast, column-based access is not critical but fast row-based access is.Multiple, separate tables if using
AlignedDynamicTablewould lead to duplication of data (i.e., de-normalize data), e.g., by having to replicate values across rows of the table.
Use AlignedDynamicTable when:
When you need to group columns in a
DynamicTableby categoryNeed to avoid name collisions between columns in a
DynamicTableand creating compound columns is not an option
Constructing a table¶
To create an AlignedDynamicTable, call the constructor with:
namestring with the name of the table, anddescriptionstring to describe the table.
from hdmf.common import AlignedDynamicTable
customer_table = AlignedDynamicTable(
name='customers',
description='an example aligned table',
)
Initializing columns of the primary table¶
The basic behavior of adding data and initalizing AlignedDynamicTable
is the same as in DynamicTable. See the DynamicTable How-To Guide
for details. E.g., using the columns and colnames parameters (which are inherited from
DynamicTable) we can define the columns of the primary table.
All columns must have the same length.
from hdmf.common import VectorData
col1 = VectorData(
name='firstname',
description='Customer first name',
data=['Peter', 'Emma']
)
col2 = VectorData(
name='lastname',
description='Customer last name',
data=['Williams', 'Brown']
)
customer_table = AlignedDynamicTable(
name='customer',
description='an example aligned table',
columns=[col1, col2]
)
Initializing categories¶
By specifying the category_tables as a list of DynamicTable
objects we can then directly specify the sub-category tables. Optionally, we can also set
the categories names of the sub-tables as an array of strings to define the ordering of categories.
from hdmf.common import DynamicTable
# create the home_address category table
subcol1 = VectorData(
name='city',
description='city',
data=['Rivercity', 'Mountaincity']
)
subcol2 = VectorData(
name='street',
description='street data',
data=['Amazonstreet', 'Alpinestreet']
)
homeaddress_table = DynamicTable(
name='home_address',
description='home address of the customer',
columns=[subcol1, subcol2]
)
# create the table
customer_table = AlignedDynamicTable(
name='customer',
description='an example aligned table',
columns=[col1, col2],
category_tables=[homeaddress_table, ]
)
# render the table in the online docs
customer_table.to_dataframe()
Adding more data to the table¶
We can add rows, columns, and new categories to the table.
Adding a row¶
To add a row via add_row we
can either: 1) provide the row data as a single dict to the data parameter or
2) specify a dict for each category and column as keyword arguments. Additional
optional arguments include id and enforce_unique_id.
customer_table.add_row(
firstname='Paul',
lastname='Smith',
home_address={'city': 'Bugcity',
'street': 'Beestree'}
)
# render the table in the online docs
customer_table.to_dataframe()
Adding a column¶
To add a columns we use add_column.
customer_table.add_column(
name='zipcode',
description='zip code of the city',
data=[11111, 22222, 33333], # specify data for the 3 rows in the table
category='home_address' # use None (or omit) to add columns to the primary table
)
# render the table in the online docs
customer_table.to_dataframe()
Adding a category¶
To add a new DynamicTable as a category,
we use add_category.
Note
Only regular DynamicTables are allowed as category tables. Using
an AlignedDynamicTable as a category for another AlignedDynamicTable
is currently not supported.
# create a new category DynamicTable for the work address
subcol1 = VectorData(
name='city',
description='city',
data=['Busycity', 'Worktown', 'Labortown']
)
subcol2 = VectorData(
name='street',
description='street data',
data=['Cannery Row', 'Woodwork Avenue', 'Steel Street']
)
subcol3 = VectorData(
name='zipcode',
description='zip code of the city',
data=[33333, 44444, 55555])
workaddress_table = DynamicTable(
name='work_address',
description='home address of the customer',
columns=[subcol1, subcol2, subcol3]
)
# add the category to our AlignedDynamicTable
customer_table.add_category(category=workaddress_table)
# render the table in the online docs
customer_table.to_dataframe()
Note
Because each category is stored as a separate DynamicTable
there are no name collisions between the columns of the home_address and work_address
tables, so that both can contain matching city, street, and zipcode columns. However,
since a category table is a sub-part of the primary table, categories must not have the
same name as other columns or other categories in the primary table.
Accessing categories, columns, rows, and cells¶
Convert to a pandas DataFrame¶
If we need to access the whole table for analysis, then converting the table
to pandas DataFrame is a convenient option. To ignore the id columns of all
category tables we can simply set the ignore_category_ids parameter.
# render the table in the online docs while ignoring the id column of category tables
customer_table.to_dataframe(ignore_category_ids=True)
Accessing categories¶
# Get the list of all categories
_ = customer_table.categories
# Get the DynamicTable object of a particular category
_ = customer_table.get_category(name='home_address')
# Alternatively, we can use normal array slicing to get the category as a pandas DataFrame.
# NOTE: In contrast to the previous call, the table is here converted to a DataFrame.
_ = customer_table['home_address']
Accessing columns¶
We can use the standard Python in operator to check if a column exists
# To check if a column exists in the primary table we only need to specify the column name
# or alternatively specify the category as None
_ = 'firstname' in customer_table
_ = (None, 'firstname') in customer_table
# To check if a column exists in a category table we need to specify the category
# and column name as a tuple
_ = ('home_address', 'zipcode') in customer_table
We can use standard array slicing to get the VectorData object of a column.
# To get a column from the primary table we just provide the name.
_ = customer_table['firstname']
# To get a column from a category table we provide both the category name and column name
_ = customer_table['home_address', 'city']
Accessing rows¶
Accessing rows works much like in DynamicTable How-To Guide
# Get a single row by index as a DataFrame
customer_table[1]
# Get a range of rows as a DataFrame
customer_table[0:2]
# Get a list of rows as a DataFrame
customer_table[[0, 2]]
Accessing cells¶
To get a set of cells we need to specify the: 1) category, 2) column, and 3) row index when slicing into the table.
When selecting from the primary table we need to specify None for the category, followed by the column name and the selection.
# Select rows 0:2 from the 'firstname' column in the primary table
customer_table[None, 'firstname', 0:2]
['Peter', 'Emma']
# Select rows 1 from the 'firstname' column in the primary table
customer_table[None, 'firstname', 1]
'Emma'
# Select rows 0 and 2 from the 'firstname' column in the primary table
customer_table[None, 'firstname', [0, 2]]
['Peter', 'Paul']
# Select rows 0:2 from the 'city' column of the 'home_address' category table
customer_table['home_address', 'city', 0:2]
['Rivercity', 'Mountaincity']