*** working document ***

SWC plus (SWC+) format specification

Reconstructed neuron morphologies play an important role in data exchange between neurophysiology labs and the general neuroscience community. The data formats currently in use, however, do not make it easy to extract the full knowledge contained in the data. On the one hand, there are closed-source formats that have been developed over the years to support the ever-expanding feature set of commercial tracing software, which is often directly attached to the microscope setup. The most popular of these is Neurolucida (MicroBrightField Inc.), in various incarnations. The schema behind these formats has been reverse-engineered to some extent, but remains elusive. At the other end of the spectrum is the very simple SWC format (Cannon, 1998), which was designed to store trees as connected cylindrical segments that form the basis of compartmental models. Converting from the commercial, feature-rich formats to SWC comes with a large loss of information: for example, all lines, markers and contours that mark features of the neuron and its position in the tissue are lost.

What all formats lack is a standard way to describe such features. For example, there is no standard way to label the border between layer 1 and 2: "layer 1/2", "Layer 1", or "border between layer 1 and 2"? This is the motivation for introducing a new format called SWC+. As the name suggests, it is derived from SWC. It does, however, allow lines, contours, markers and even images and surfaces to be embedded, and if desired a neuronal tree can be distributed across multiple files. SWC+ comes with a Type Library that encourages the use of common terminology in describing properties of neurons and the surrounding tissue.

An SWC-compatible format with Extensible Types

SWC plus (SWC+) is an open source (github) format for storing neuron morphologies. It is both forward and backward compatible with the standardized SWC format used by NeuroMorpho.org. Standard SWC files contain a list of connected points that represent neurons as hierarchical trees. Every point has a TypeID that assigns it to a cell part. Standard SWC defines four cell parts: 1. Soma, 2. Axon, 3. (basal) Dendrite, 4. Apical dendrite. TypeIDs of 5 and up are allowed for custom types. However, custom types are rarely used, since software that reads SWC-files has no way of knowing what a file's custom types represent. In SWC+, the support for custom types is drastically improved in three ways:

All custom Types must use a TypeID of 16 and up. It is assumed that existing SWC parsers will treat points with TypeIDs of 16 and up as unknown, and can therefore still extract trees (but not annotations) from SWC+ files. For this reason, SWC+ keeps using the .swc file extension.

The remainder of this document first specifies standard SWC, and then describes the SWC+ format in detail.

Specification of standard SWC

(Adapted from: Neuronland SWC specification)
An SWC file (S.W.C. stands for the last names of its initial designers Ed Stockley, Howard Wheal, and Robert Cannon) is a text file that starts with a header section in which each line starts with the symbol #. Some SWC-variants use this section to store information about the data in an orderly fashion; others treat it as a free-text field. According to the original publication that introduced SWC, the header contains the following prescribed fields:
  ORIGINAL_SOURCE : File type delivered by digitisation equipment
  CREATURE : Species from which the cell came
  REGION : Brain region
  FIELD/LAYER : Location within region
  TYPE : Cell type
  CONTRIBUTOR : Name–initials, organisation, e.g. Turner–DA, Duke
  REFERENCE : Where the data has been published
  RAW : File name of original data
  EXTRAS : Files containing further information on this cell
  SOMA_AREA : Area of soma (in mm2)
  SHRINKAGE_CORRECTION : x, y and z correction factors
  VERSION_NUMBER : To identify different versions of the same raw data
  VERSION_DATE : Date this version was created (yyyy-mm-dd)
  SCALE : Used internally to record applied shrinkage corrections
However, these fields are not present in the NeuroMorpho.org SWC format and therefore cannot be considered a required component of SWC.

Below the header section, a points matrix with 7 columns follows. It contains the points traced along the neuronal tree. The seven numbers in each row are separated by spaces, and have the following meaning:

  1. SampleID : Sample identifier. A positive integer.
  2. TypeID : Type identifier. The basic set of types used in NeuroMorpho.org SWC files is:
    -1  - root
     0  - undefined
     1  - soma
     2  - axon
     3  - (basal) dendrite
     4  - apical dendrite
     5+ - custom
    In addition, some SWC-variants use the following types 5 and 6:
    5 - branch point (redundant: branch is a point with multiple children)
    6 - end point (redundant: end point is a point with zero children)
  3. x : X-position in micrometers
  4. y : Y-position in micrometers
  5. z : Z-position in micrometers
  6. r : Radius in micrometers (half the cylinder thickness)
  7. ParentID : Parent sample identifier. This defines how points are connected to each other. In a tree, multiple points can have the same ParentID. The first point in the file must have a ParentID equal to -1, which represents the root point. Parent samples must be defined before they are referred to. By counting how many points refer to a given parent, the number of its children can be computed.
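
As an illustration of the seven-column layout, the following Python sketch reads the header lines and the points matrix, checks that parents are defined before they are referred to, and counts the children of each point. The names SwcPoint, parse_swc and count_children are hypothetical helpers, not part of any SWC library, and the example data is made up.

```python
from typing import NamedTuple


class SwcPoint(NamedTuple):
    sample_id: int
    type_id: int
    x: float
    y: float
    z: float
    r: float
    parent_id: int


def parse_swc(text):
    """Return (header_lines, points) for the text of a standard SWC file."""
    header, points, seen = [], [], set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            header.append(line[1:].strip())
            continue
        f = line.split()
        if len(f) != 7:
            raise ValueError(f"expected 7 columns, got {len(f)}: {line!r}")
        p = SwcPoint(int(f[0]), int(f[1]), float(f[2]), float(f[3]),
                     float(f[4]), float(f[5]), int(f[6]))
        # A parent must be the root marker (-1) or an earlier sample.
        if p.parent_id != -1 and p.parent_id not in seen:
            raise ValueError(f"sample {p.sample_id}: parent {p.parent_id} not yet defined")
        seen.add(p.sample_id)
        points.append(p)
    return header, points


def count_children(points):
    """Count how many points refer to each sample as their parent."""
    counts = {p.sample_id: 0 for p in points}
    for p in points:
        if p.parent_id in counts:
            counts[p.parent_id] += 1
    return counts


EXAMPLE = """\
# example file
1 1 0 0 0 5 -1
2 3 5 0 0 1 1
3 3 10 0 0 1 2
4 3 10 5 0 1 2
"""

header, points = parse_swc(EXAMPLE)
children = count_children(points)
```

In this example, sample 2 has two children and is therefore a branch point, while samples 3 and 4 have none and are end points.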

Specification of the SWC+ extensions

Not all aspects of regular SWC are well defined. For example, how to deal with the redundant types 5 (branch point) and 6 (end point)? In SWC+, types 5 and 6 are first reset to 0, and then all points with TypeID 0 inherit the TypeID from their parent point. After this normalization step, the number of objects in the points matrix can be found by counting the number of point-sets that it contains. A point-set is defined as a set of connected points (through ParentID) that all have the same TypeID.
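
The normalization and the point-set count can be sketched in Python as follows. This is a minimal sketch, assuming that points are given as (SampleID, TypeID, ParentID) tuples with parents listed before their children; the function names are hypothetical. A point starts a new point-set when it has no parent in the file or when its normalized TypeID differs from that of its parent, so counting those points gives the number of point-sets.

```python
def normalize_types(points):
    """Map each SampleID to its normalized TypeID.

    points: iterable of (sample_id, type_id, parent_id), parents first.
    """
    types = {}
    for sample_id, type_id, parent_id in points:
        if type_id in (5, 6):
            type_id = 0  # branch/end point types are redundant
        if type_id == 0 and parent_id in types:
            type_id = types[parent_id]  # inherit from the (already normalized) parent
        types[sample_id] = type_id
    return types


def count_point_sets(points):
    """Count sets of connected points that share one normalized TypeID."""
    points = list(points)
    types = normalize_types(points)
    return sum(
        1
        for sample_id, _, parent_id in points
        if parent_id not in types or types[parent_id] != types[sample_id]
    )


# Made-up example: a soma with one dendrite and one axon, plus a
# separate two-point object of custom Type 16.
EXAMPLE_POINTS = [
    (1, 1, -1),
    (2, 3, 1),
    (3, 5, 2),   # branch point, becomes 3
    (4, 0, 3),   # undefined, becomes 3
    (5, 6, 3),   # end point, becomes 3
    (6, 2, 1),
    (7, 16, -1),
    (8, 16, 7),
]

normalized = normalize_types(EXAMPLE_POINTS)
n_sets = count_point_sets(EXAMPLE_POINTS)
```

Note that under this definition two dendrites that leave the same soma count as two point-sets, because they are connected only through a point of a different TypeID.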

In SWC+, the free-text of the comment section in the file is replaced by an XML-formatted header that contains meta data and a list of all the Types that are used in the second column of the data matrix. For compatibility, each line in the header still starts with # but otherwise the header is written in XML and must be structured according to the SWC+ XML schema. The XML root element must use the tag <SWCplus>, and therefore the first line of an SWC+ file, after removing the # plus white space, starts with the signature

<SWCplus version="X"
where X is the SWC+ version number. If this signature is not found (as in regular SWC files), or if the header cannot be parsed as XML, then the header is considered to be a free-text comment.
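
A reader could apply this rule roughly as in the Python sketch below. It is not a reference implementation: the function name read_header, the version value "1.0" and the nesting of CustomTypes directly under the root element are assumptions made for the example. It uses only the standard library modules re and xml.etree.ElementTree.

```python
import re
import xml.etree.ElementTree as ET

# The header must start with this signature once '#' and white space are removed.
SIGNATURE = re.compile(r'<SWCplus\s+version="([^"]*)"')


def read_header(text):
    """Return ("swcplus", root_element) or ("free-text", header_string)."""
    lines = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            lines.append(stripped[1:].strip())
        elif stripped:
            break  # the first data row ends the header
    header = "\n".join(lines)
    if SIGNATURE.match(header):
        try:
            return "swcplus", ET.fromstring(header)
        except ET.ParseError:
            pass  # signature present but not well-formed XML
    return "free-text", header


SWCPLUS_FILE = """\
# <SWCplus version="1.0">
#   <CustomTypes>
#     <Contour id="17" name="Air bubble" closed="true"/>
#   </CustomTypes>
# </SWCplus>
1 1 0 0 0 5 -1
"""
PLAIN_FILE = "# traced by hand\n1 1 0 0 0 5 -1\n"
BROKEN_FILE = '# <SWCplus version="1.0">\n# <unclosed>\n1 1 0 0 0 5 -1\n'

kind_plus, root = read_header(SWCPLUS_FILE)
kind_plain, plain_header = read_header(PLAIN_FILE)
kind_broken, _ = read_header(BROKEN_FILE)
```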

A custom Type in SWC+ is created by assigning custom attributes to a Type selected from the SWC+ Type Library (STL). The STL includes the four base classes Soma, Axon, (basal) Dendrite and Apical dendrite, and these have their id fixed to 1, 2, 3 and 4, respectively, to ensure compatibility with standard SWC. All other Types must be assigned ids of 16 and up.

The STL defines the following Types:

A custom Type is defined in the SWC+ XML-header as an XML-element with tag equal to one of the Types from the STL, and attributes set to custom values. For example, the custom Type for the border between layer 3 and 4 is:

<LayerBorder name="Border between layers {a} and {b}" a="3" b="4"/>
The custom Type for the contour of an air bubble in a brain slice would be
<Contour id="17" name="Air bubble" closed="true"/>
Custom Types may also introduce new attributes, for example
<Soma restingPotential="-69 mV"/>
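
The Python sketch below shows one way a reader might turn such an element into a record. It assumes that placeholders like {a} in the name attribute are meant to be filled in from the attributes of the same name; this is inferred from the LayerBorder example and is not stated by the specification. The function describe_type and the keys of the returned dictionary are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def describe_type(xml_fragment):
    """Parse one custom Type element into a small dictionary."""
    elem = ET.fromstring(xml_fragment)
    attrs = dict(elem.attrib)
    # Fall back to the STL tag when no name attribute is given.
    name = attrs.get("name", elem.tag)
    # Assumption: "{key}" is replaced by the attribute called key;
    # unknown placeholders are left untouched.
    label = re.sub(r"\{(\w+)\}", lambda m: attrs.get(m.group(1), m.group(0)), name)
    type_id = int(attrs["id"]) if "id" in attrs else None
    return {"base": elem.tag, "id": type_id, "label": label, "attributes": attrs}


border = describe_type(
    '<LayerBorder name="Border between layers {a} and {b}" a="3" b="4"/>'
)
bubble = describe_type('<Contour id="17" name="Air bubble" closed="true"/>')
soma = describe_type('<Soma restingPotential="-69 mV"/>')
```

Elements without an explicit id attribute get id None in this sketch; a real reader would still have to resolve the fixed ids of the four base classes.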

Custom Type definitions are not the only purpose of the SWC+ XML-header. It can also contain a brief and basic description of the experiment that underlies the data. The formal description of the XML-header is given by the SWC+ XML schema. The following XML-skeleton summarizes that schema:

The above skeleton is an attempt to cover the most important aspects of a morphology dataset in a compact header, inspired by the meta data listed at NeuroMorpho.org. The CustomTypes section is required, all other sections are optional.

Backward and forward compatibility with SWC

The five pre-assigned Types with id 0 to 4 in regular SWC are still present in SWC+.

Backward compatibility is achieved by treating headers that do not carry the SWC+ signature as free-text headers, and treating all TypeIDs larger than 6 as unknown Types.

Forward compatibility is achieved by using TypeID 16 and up for custom Types, and relying on the assumption that existing software will recognize these as unknown Types and will not treat them as part of any neurite tree.

SWC+ versioning

The root element of an SWC+ XML-header has a version attribute which is formatted as "X.Y" with X the major and Y the minor version number. The XML Schema for a given major release is found at https://neuroinformatics.nl/HBP/SWCplus/SWCplus_vX.xsd, where X is the major release. The root element of this schema has attributes version and maturity. As soon as maturity reaches the "stable" level, no new Types will be introduced for the given major release.

Software that reads SWC+ files must implement a version of the XML Schema that is equal to or higher than the version of the SWC+ file that it tries to parse. If not, the SWC+ file may contain unknown Types.
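
The version handling can be sketched in Python as follows. The function names are hypothetical, and treating a reader whose schema version equals the file version as compatible is an assumption of this sketch. Versions are compared as pairs of integers, so that "1.10" counts as newer than "1.9".

```python
def parse_version(version):
    """Split "X.Y" into (major, minor) integers."""
    major, minor = version.split(".")
    return int(major), int(minor)


def schema_url(version):
    """Location of the XML Schema for the major release of this version."""
    major, _ = parse_version(version)
    return f"https://neuroinformatics.nl/HBP/SWCplus/SWCplus_v{major}.xsd"


def reader_supports(reader_version, file_version):
    """True if the reader's schema version is not older than the file's."""
    return parse_version(reader_version) >= parse_version(file_version)


# Made-up version numbers, for illustration only.
url = schema_url("2.3")
newer_ok = reader_supports("1.10", "1.9")
older_fails = reader_supports("1.2", "1.3")
```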

In practice, software that reads SWC+ does not need to validate every XML-header against the XML-schema, but in case of problematic files a run through an XML validation service may produce useful error messages.

Release notes

SWC+ is currently under construction, its maturity is "" and the latest version is .

SWC+ implementation

The SWC+ standard was originally developed by Rembrandt Bakker as an export format for the HBP Morphology viewer, which can read all Neurolucida variants, and import and repair SWC files with certain quirks. The user can select/deselect parts of the neuron and export to SWC+, see this document for implementation details.