VDB-2 - PlacementSetIterator

Type:
interface

Header:
align/iterator.h

Revision History:
2012-Feb-06 rodarmer
2012-Feb-07 rodarmer
2012-May-04 documented negative placement starting coords

Description

The PlacementSetIterator is an interface that allows for walking a window of placements along the reference of a single run. On each iteration, one or more placements become available at a position until the placements are exhausted within the window.

It differs from a PlacementIterator in that it contains a set of the latter and performs sorted access to them. Otherwise, it presents an identical interface.

At this point we are considering keeping the two interfaces separate to avoid making them polymorphic. They should nevertheless not be allowed to diverge, so that making them polymorphic remains possible in the future.

Requirements

  1. Must adhere to all requirements of PlacementIterator
  2. Must remain compatible in case interfaces are made polymorphic

PlacementSetIterator

Make

ask the alignment manager to create an iterator from individual components

rc_t AlignMgrMakePlacementSetIterator ( const AlignMgr *self,
    PlacementSetIterator **iter, uint64_t ref_pos, uint32_t ref_len );

iter - OUT
return parameter for the iterator

ref_pos
starting position of alignment in reference coordinates

ref_len
length of projection onto reference in reference space

AddRef

duplicate an existing reference

rc_t PlacementSetIteratorAddRef ( const PlacementSetIterator *self );

The object is defined as being reference counted. In VDB-2, references are direct pointers to objects and the objects maintain a reference counter.

Release

release an existing reference
potentially whacks object

rc_t PlacementSetIteratorRelease ( const PlacementSetIterator *self );

The object is defined as being reference counted. In VDB-2, references are direct pointers to objects and the objects maintain a reference counter.

NULL pointers are ignored.
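
The reference-counting contract described for AddRef and Release can be sketched as follows. This is an illustration only: ToyObj, ToyObjMake, ToyObjAddRef, ToyObjRelease and toy_live_objects are hypothetical names that do not exist in VDB-2, the int return value stands in for rc_t, and treating a NULL pointer as a no-op in AddRef is an assumption of the sketch (the text above states it only for Release).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted VDB-2 object. */
typedef struct ToyObj {
    unsigned refcount;  /* number of outstanding references */
} ToyObj;

/* Number of toy objects currently alive; only used to observe destruction. */
static int toy_live_objects = 0;

/* Create an object holding a single reference owned by the caller. */
static ToyObj *ToyObjMake(void)
{
    ToyObj *obj = malloc(sizeof *obj);
    if (obj != NULL) {
        obj->refcount = 1;
        ++toy_live_objects;
    }
    return obj;
}

/* Duplicate a reference: the pointer is shared, only the counter changes.
   The const is cast away because the counter is bookkeeping, not content. */
static int ToyObjAddRef(const ToyObj *self)
{
    if (self != NULL)  /* assumption of this sketch: NULL is a no-op */
        ++((ToyObj *)self)->refcount;
    return 0;
}

/* Drop a reference; the object is destroyed when the last one is released.
   NULL pointers are ignored. */
static int ToyObjRelease(const ToyObj *self)
{
    if (self != NULL) {
        ToyObj *obj = (ToyObj *)self;
        assert(obj->refcount > 0);
        if (--obj->refcount == 0) {
            --toy_live_objects;
            free(obj);
        }
    }
    return 0;
}
```

Because a reference is just the pointer itself, every holder that calls AddRef must later call Release exactly once; the object goes away only with the last Release.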

AddPlacementIterator

add an actual iterator to the set

rc_t PlacementSetIteratorAddPlacementIterator ( PlacementSetIterator *self, PlacementIterator *pi );

pi
an iterator over a single pair of reference and alignment tables

The job of the PlacementIterator is to walk horizontally across all placements within a window of the reference. The job of the PlacementSetIterator is to walk vertically over all existing pairs.

This is the means of adding sub-iterators to the set.

The code is required to behave properly regardless of the number of iterators in the set.
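
To make the horizontal/vertical distinction concrete, here is a minimal sketch of a set that performs sorted access over any number of sub-iterators, including none. All names (ToyIter, ToySet, ToySetAdd, ToySetNextAvailPos, ToySetTakeAt, TOY_MAX_ITERS) are hypothetical and the sub-iterators are plain sorted arrays; this is not how the library is implemented, only one way the described behavior could look.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_MAX_ITERS = 8 };

/* Hypothetical sub-iterator: a sorted array of start positions plus a cursor. */
typedef struct ToyIter {
    const int32_t *pos;  /* placement start positions, ascending */
    size_t count;        /* number of positions */
    size_t next;         /* index of the next unconsumed position */
} ToyIter;

/* Hypothetical set: a fixed-size collection of sub-iterators. */
typedef struct ToySet {
    ToyIter *iters[TOY_MAX_ITERS];
    size_t count;
} ToySet;

/* Add a sub-iterator; returns 0 on success, -1 on bad arguments or a full set. */
static int ToySetAdd(ToySet *self, ToyIter *it)
{
    if (self == NULL || it == NULL || self->count == TOY_MAX_ITERS)
        return -1;
    self->iters[self->count++] = it;
    return 0;
}

/* Report the smallest pending position across all sub-iterators.
   Returns 0 and fills *pos, or -1 when nothing is pending,
   which is also the result for a set with no iterators at all. */
static int ToySetNextAvailPos(const ToySet *self, int32_t *pos)
{
    int found = 0;
    size_t i;
    assert(self != NULL && pos != NULL);
    for (i = 0; i < self->count; ++i) {
        const ToyIter *it = self->iters[i];
        if (it->next < it->count) {
            int32_t p = it->pos[it->next];
            if (!found || p < *pos) {
                *pos = p;
                found = 1;
            }
        }
    }
    return found ? 0 : -1;
}

/* Consume one placement starting exactly at pos; returns the index of the
   sub-iterator it came from, or -1 when none is left at that position. */
static int ToySetTakeAt(ToySet *self, int32_t pos)
{
    size_t i;
    for (i = 0; i < self->count; ++i) {
        ToyIter *it = self->iters[i];
        if (it->next < it->count && it->pos[it->next] == pos) {
            ++it->next;
            return (int)i;
        }
    }
    return -1;
}
```

With sub-iterators holding { 5, 9 } and { 3, 5 }, the set reports 3, then 5 (served once by each sub-iterator), then 9. An empty set reports "nothing pending" immediately.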

NextReference

advance to the next reference

rc_t PlacementSetIteratorNextReference ( PlacementSetIterator *self,
    INSDC_coord_zero *first_pos, INSDC_coord_len *len, struct ReferenceObj const ** refobj );

first_pos - OUT
start position of the first alignment on the next reference

len - OUT
outer length of the alignments on the next reference

refobj - OUT
next reference

This function has to be called (at least once) to advance to the next reference.

NextWindow

advance to the next window

rc_t PlacementSetIteratorNextWindow ( PlacementSetIterator *self,
    INSDC_coord_zero *first_pos, INSDC_coord_len *len );

first_pos - OUT
start position of the window

len - OUT
length of the window

This function has to be called (at least once) to advance to the next window on the reference.

NextAvailPos

check the next available position on reference having one or more placements
returns position and optionally length

rc_t PlacementSetIteratorNextAvailPos ( const PlacementSetIterator *self,
    INSDC_coord_zero *pos, INSDC_coord_len *len );

pos - OUT
the reference position where the next available placement starts. NB: this can be negative if the alignment wraps around.

len - OUT, NULL OKAY
optional parameter returning the length of the next available placement

This message returns information about the next available placement, or if none are available, causes the iterator to search for more in its open cursors.

If no further placements are found, a non-zero return code will be issued (exact code TBD). An empty set will return this code immediately.

The exact position returned is used to read placement records using either NextRecordAt or NextIdAt.

The optional returned length is useful for performing a merge-sort on the available placements from several iterators. This message may be safely invoked any number of times, where the only side-effect possible is a single attempt at retrieving more data (on the initial invocation).
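
The "safe to invoke any number of times" behavior can be sketched as a cached peek. The sketch below assumes a single iterator whose open cursors are replaced by a pre-sorted array; ToyPlacement, ToyIter, ToyIterNextAvailPos, ToyIterTakeAt and the fetches counter are hypothetical names introduced for illustration, and -1 stands in for the still-undetermined "no further placements" return code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placement: start position and length on the reference. */
typedef struct ToyPlacement {
    int32_t pos;   /* may be negative, e.g. for a wrapped alignment */
    uint32_t len;
} ToyPlacement;

/* Hypothetical iterator over a pre-sorted array standing in for open cursors. */
typedef struct ToyIter {
    const ToyPlacement *src;  /* backing data, sorted by pos */
    size_t count;             /* number of entries in src */
    size_t next;              /* next entry to fetch from src */
    ToyPlacement head;        /* cached next available placement */
    int have_head;            /* non-zero once head is valid */
    unsigned fetches;         /* how many times src was consulted */
} ToyIter;

/* Peek at the next available placement without consuming it.
   Returns 0 on success, -1 when exhausted. len may be NULL. */
static int ToyIterNextAvailPos(const ToyIter *cself, int32_t *pos, uint32_t *len)
{
    ToyIter *self = (ToyIter *)cself;  /* the cache is logically mutable */
    assert(self != NULL && pos != NULL);
    if (!self->have_head) {
        /* the only side effect: one attempt at retrieving more data */
        ++self->fetches;
        if (self->next == self->count)
            return -1;
        self->head = self->src[self->next++];
        self->have_head = 1;
    }
    *pos = self->head.pos;
    if (len != NULL)
        *len = self->head.len;
    return 0;
}

/* Consume the cached placement if it starts exactly at pos.
   This toy caches a single record, so a fresh peek is needed afterwards. */
static int ToyIterTakeAt(ToyIter *self, int32_t pos)
{
    if (!self->have_head || self->head.pos != pos)
        return -1;
    self->have_head = 0;
    return 0;
}
```

Repeated peeks return the same position and leave the fetch counter unchanged until the cached placement is consumed.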

NextRecordAt

retrieve and consume next available PlacementRecord

rc_t PlacementSetIteratorNextRecordAt ( PlacementSetIterator *self,
    INSDC_coord_zero pos, const PlacementRecord **rec );

pos
the exact position returned by NextAvailPos; identifies the location being queried

rec - OUT
return parameter for the next available placement at pos

This message allows a single record to be obtained on each invocation, where the intent is that the caller will loop until no further records are found at the stated position.

By looping, the code is not forced to create lists of placements that align at the exact same starting point, which further allows using multiple iterators in a sort-merge configuration.

As mentioned before, the record is designed to be held in a doubly-linked list and freed independently. The caller obtains locally sorted records from this iterator and places them into the list.
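
The loop-until-empty pattern and the list handling described above might look like the following sketch. It does not use the real PlacementRecord: ToyRecord, ToyList, ToySource and the functions operating on them are hypothetical, the source is a pair of sorted arrays, and the return value 1 is an invented stand-in for "no further record at this position".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record that sits in a doubly-linked list and is freed on its own. */
typedef struct ToyRecord {
    struct ToyRecord *prev, *next;
    int32_t pos;
    int64_t id;
} ToyRecord;

typedef struct ToyList {
    ToyRecord *head, *tail;
    size_t count;
} ToyList;

/* Hypothetical source: parallel arrays sorted by position. */
typedef struct ToySource {
    const int32_t *pos;
    const int64_t *id;
    size_t count, next;
} ToySource;

/* Hand out one newly allocated record starting exactly at pos.
   Returns 0 on success, 1 when none is left at pos, -1 on allocation failure. */
static int ToySourceNextRecordAt(ToySource *self, int32_t pos, ToyRecord **rec)
{
    ToyRecord *r;
    if (self->next == self->count || self->pos[self->next] != pos)
        return 1;
    r = malloc(sizeof *r);
    if (r == NULL)
        return -1;
    r->prev = r->next = NULL;
    r->pos = pos;
    r->id = self->id[self->next++];
    *rec = r;
    return 0;
}

/* Append a record to the end of the list. */
static void ToyListPushTail(ToyList *list, ToyRecord *rec)
{
    rec->prev = list->tail;
    rec->next = NULL;
    if (list->tail != NULL)
        list->tail->next = rec;
    else
        list->head = rec;
    list->tail = rec;
    ++list->count;
}

/* Unlink one record and free it, independently of its neighbours. */
static void ToyListWhack(ToyList *list, ToyRecord *rec)
{
    if (rec->prev != NULL) rec->prev->next = rec->next; else list->head = rec->next;
    if (rec->next != NULL) rec->next->prev = rec->prev; else list->tail = rec->prev;
    --list->count;
    free(rec);
}

/* Loop until the source has nothing more at pos.
   Returns the number of records appended, or -1 on error. */
static int ToyDrainAt(ToySource *src, int32_t pos, ToyList *list)
{
    int taken = 0;
    for (;;) {
        ToyRecord *rec = NULL;
        int rc = ToySourceNextRecordAt(src, pos, &rec);
        if (rc == 1)
            return taken;
        if (rc != 0)
            return -1;
        ToyListPushTail(list, rec);
        ++taken;
    }
}
```

The caller never receives a batch of records for one position; it pulls them one at a time, which is what allows records from several iterators to be interleaved in a sort-merge.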

NextIdAt

retrieve information from the next available PlacementRecord
douse the record upon return

rc_t PlacementSetIteratorNextIdAt ( PlacementSetIterator *self,
    INSDC_coord_zero pos, int64_t *row_id, INSDC_coord_len *len );

pos
the exact position returned by NextAvailPos; identifies the location being queried

row_id - OUT
return parameter for the next placement's id

len - OUT, NULL OKAY
optional return parameter for the next placement's length

This message simply extracts information held within internal records. See NextRecordAt.

example:

PlacementSetIterator *pl_set_iter;
rc_t rc = AlignMgrMakePlacementSetIterator( alignment_manager, &pl_set_iter );
if ( rc == 0 )
{
    uint32_t i;
    for ( i = 0; i < n && rc == 0; ++i )  /* n: caller-supplied number of placement-iterators to add */
    {
        PlacementIterator *pl_iter;
        rc = make_pl_iter( &pl_iter );  /* user supplied helper function to make a placement-iterator */
        if ( rc == 0 )
            rc = PlacementSetIteratorAddPlacementIterator( pl_set_iter, pl_iter );
    }
    while ( rc == 0 )
    {
        struct ReferenceObj const *refobj;
        INSDC_coord_zero first_pos;
        INSDC_coord_len len;
        rc = PlacementSetIteratorNextReference( pl_set_iter, &first_pos, &len, &refobj );
        if ( rc == 0 )
        {
            while ( rc == 0 )
            {
                INSDC_coord_zero w_pos;
                INSDC_coord_len w_len;
                rc = PlacementSetIteratorNextWindow ( pl_set_iter, &w_pos, &w_len );
                if ( rc == 0 )
                {
                    while ( rc == 0 )
                    {
                        INSDC_coord_len p_len;  /* renamed so it does not shadow the outer len */
                        INSDC_coord_zero pos;
                        rc = PlacementSetIteratorNextAvailPos ( pl_set_iter, &pos, &p_len );
                        if ( rc == 0 )
                        {
                            while ( rc == 0 )
                            {
                                const PlacementRecord *rec;
                                rc = PlacementSetIteratorNextRecordAt ( pl_set_iter, pos, &rec );
                                if ( rc == 0 )
                                {
                                    /* handle the placement-record... */
                                }
                            }
                            if ( GetRCState( rc ) == rcDone ) rc = 0;
                        }
                    }
                    if ( GetRCState( rc ) == rcDone ) rc = 0;
                }
            }
            if ( GetRCState( rc ) == rcDone ) rc = 0;
        }
    }
    PlacementSetIteratorRelease( pl_set_iter );
}

NCBI VDB-2 Documentation