FARGOS/VISTA Object Management Environment Core
CSV_File Class Reference

Read and process Comma-Separated-Value (or equivalent) files a record at a time.

#include <read_file.hpp>


Public Types

enum  { MAX_FIELDS = 128 }
 
- Public Types inherited from Read_And_Process_File
enum  ReadModes {
  READ_GUESS =0, READ_NORMAL =1, READ_ZLIB =2, READ_LZ4 =4,
  READ_PCAP =8, WRITE =16, WRITE_APPEND =WRITE | 32, WRITE_TRUNCATE =WRITE | 64,
  READ_FROM_SOCKET =128
}
 
typedef int(* ReadDataFP) (class Read_And_Process_File *input, unsigned char *bfr, uint32_t bfrLen)
 

Public Member Functions

 CSV_File (OS_HANDLE_TYPE descriptor, bool hasHeader=true, char sepChar=',')
 Parse conventional CSV file.
 
 CSV_File (OS_HANDLE_TYPE descriptor, const char *separatorList, const char *quoteChars="\"", bool hasHeader=true)
 Parse a CSV-style file that has more than one column separator character in use.
 
virtual ~CSV_File ()
 
int getFieldIndex (const char *fieldHeading) const
 Return the relative subscript for a named column.
 
int parseIntoFields (int maxFields, char *fieldStart[], unsigned char *line, size_t lineLen)
 Parse text line into fields.
 
int readAndProcessCSV_File ()
 Top-level routine to parse CSV file.
 
int processPacket (unsigned char *data, size_t len)
 Streaming interface equivalent to readAndProcessCSV_File(). The API is compatible with the processPacketUsingClass<> convenience interface for the IO_Processor class processing routine.
 
virtual int processHeaderLine (unsigned char *line, size_t lineLen) VIRTUAL_OVERRIDE
 Process initial header line in file.
 
virtual int parsedHeadingLine ()
 User-exit invoked when initial header line is parsed.
 
virtual int completedFile (int recordsSeen) VIRTUAL_OVERRIDE
 User-exit to notify that file has been completely read.
 
- Public Member Functions inherited from Read_And_Process_File
 Read_And_Process_File (OS_HANDLE_TYPE srcDescriptor, ReadModes mode=READ_NORMAL, const FileTypeReaderSelector *selectorTable=nullptr)
 Construct from an existing file descriptor.
 
 Read_And_Process_File (const char *fileName, ReadModes mode=READ_NORMAL, const FileTypeReaderSelector *selectorTable=nullptr)
 Construct given the name of a file to be opened.
 
virtual ~Read_And_Process_File ()
 
void setReadRoutine (ReadDataFP altRoutine) OME_ALWAYS_INLINE
 Set a new file read routine.
 
virtual void noteDataRead (const unsigned char *bfr, size_t bfrLen) const
 User-exit to see the original copy of any data read.
 
int readAndProcessBlocksFromFile (size_t recordLength)
 Process fixed length records.
 
int findAndProcessNextLine (File_Buffer *fileBfr, bool hasHeaderLine=false)
 Process next text line from buffer.
 
int readAndProcessTextLines (bool hasHeaderLine=false)
 Process text lines.
 
int readIntoFileBuffer (File_Buffer *bfr)
 
int readAndProcessFile ()
 Process file contents with no imposed structure.
 
virtual int beginFile ()
 
virtual int processLine (const unsigned char *line, size_t lineLen)
 
virtual int_fast32_t processBlock (unsigned char *block, size_t blockLen)
 
virtual int processBuffer (File_Buffer *bfrState)
 

Protected Attributes

unsigned char * headerLine
 
unsigned char separatorCharList [8]
 
unsigned char quoteCharList [8]
 
char * fields [MAX_FIELDS]
 
int fieldTotal
 
uint32_t readsPerformed
 
unsigned char separatorListLen
 
unsigned char quoteListLen
 
bool doHeaderLine
 
- Protected Attributes inherited from Read_And_Process_File
File_Buffer intermediateBuffer
 

Additional Inherited Members

- Static Public Member Functions inherited from Read_And_Process_File
static int defaultReadRoutine (Read_And_Process_File *input, unsigned char *bfr, uint32_t bfrLen)
 Default read routine.
 
static const FileTypeReaderSelector findTypeOfFile (const char *fileName, const FileTypeReaderSelector *selectorTable)
 
static OS_HANDLE_TYPE openFile (const char *fileName, ReadModes mode=READ_NORMAL)
 Open the indicated file.
 
static int closeFile (OS_HANDLE_TYPE fd, ReadModes mode=READ_NORMAL)
 Close a native operating system file handle.
 
static int findFileInPathsWithSuffixes (char *path, uint_fast32_t pathLen, const char *searchRootPaths, const char *possibleFilenames="", const char *possibleSuffixes="")
 Search for a file using a combination of directory roots and file suffixes.
 
- Public Attributes inherited from Read_And_Process_File
ReadDataFP readRoutine
 
void * auxData
 
OS_HANDLE_TYPE descriptor
 
uint32_t recordsProcessed
 
enum ReadModes readMode
 
uint32_t _explicitAlignmentPadding
 
- Protected Member Functions inherited from Read_And_Process_File
virtual int readIntoBuffer (unsigned char *bfr, size_t bfrLen)
 
- Static Protected Member Functions inherited from Read_And_Process_File
static uint32_t matchFileHeader (const FileTypeReaderSelector *criteria, const unsigned char *fileHdr, const size_t hdrLen)
 

Detailed Description

Read and process Comma-Separated-Value (or equivalent) files a record at a time.

Member Enumeration Documentation

◆ anonymous enum

anonymous enum
Enumerator
MAX_FIELDS 

Constructor & Destructor Documentation

◆ CSV_File() [1/2]

CSV_File::CSV_File ( OS_HANDLE_TYPE  descriptor,
bool  hasHeader = true,
char  sepChar = ',' 
)
explicit

Parse conventional CSV file.

Parameters
descriptor  is the descriptor/handle of an already open file.
hasHeader  is an optional argument that indicates if the file has an initial header describing the column names.
sepChar  is an optional argument indicating the separator character used between columns; it defaults to a comma.

By default, the double quote mark can be used to group elements as a single field.

References doHeaderLine, fieldTotal, headerLine, quoteCharList, quoteListLen, readsPerformed, separatorCharList, and separatorListLen.

◆ CSV_File() [2/2]

CSV_File::CSV_File ( OS_HANDLE_TYPE  descriptor,
const char *  separatorList,
const char *  quoteChars = "\"",
bool  hasHeader = true 
)

Parse a CSV-style file that has more than one column separator character in use.

Parameters
descriptor  is the descriptor/handle of an already open file.
separatorList  is a pointer to a text string listing possible separator characters used between columns.
quoteChars  is an optional pointer to a text string listing possible quote characters.
hasHeader  is an optional argument that indicates if the file has an initial header describing the column names.

References doHeaderLine, fieldTotal, headerLine, LOG_CERR, LOG_ENDLINE, OME_EXPECT_FALSE, quoteCharList, quoteListLen, readsPerformed, separatorCharList, and separatorListLen.
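The protected attributes listed on this page show separatorCharList and quoteCharList as 8-byte arrays, each paired with a length byte. How the constructor reacts to a longer list is not documented here. The standalone sketch below illustrates one plausible way to store such a list in a fixed-size array; the names SketchCharList and sketchStoreCharList are hypothetical, and the policy of rejecting an over-long list is an assumption, not the library's documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a fixed-capacity character list such as
// separatorCharList[8] plus separatorListLen; not part of the FARGOS API.
struct SketchCharList {
    unsigned char chars[8];
    unsigned char len;
};

// Copy `list` into `dst` if it fits.
// ASSUMPTION: an over-long list is rejected and `dst` is left unchanged;
// the real constructor may behave differently (for example, truncate).
bool sketchStoreCharList(SketchCharList &dst, const char *list)
{
    const std::size_t n = std::strlen(list);
    if (n > sizeof(dst.chars)) {
        return false;                       // would overflow the 8-byte array
    }
    std::memcpy(dst.chars, list, n);        // no terminator is stored; len tracks size
    dst.len = static_cast<unsigned char>(n);
    return true;
}
```

With this layout, a caller wanting comma, semicolon, pipe, and tab as separators would pass the four-character string ",;|\t".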

◆ ~CSV_File()

virtual CSV_File::~CSV_File ( )
inline virtual

References headerLine.

Member Function Documentation

◆ completedFile()

int CSV_File::completedFile ( int  recordsSeen)
virtual

User-exit to notify that file has been completely read.

Return values
0  for now, always return 0. The value is currently ignored.

Reimplemented from Read_And_Process_File.

References LOG_CERR, LOG_ENDLINE, and Read_And_Process_File::recordsProcessed.

Referenced by processPacket().

◆ getFieldIndex()

int CSV_File::getFieldIndex ( const char *  fieldHeading) const

Return the relative subscript for a named column.

Parameters
fieldHeading  specifies the name of the desired column.
Returns
The relative index of the desired column (starting at 0) is returned.
Return values
-1  indicates the column was not found.

References fields, fieldTotal, LOG_CERR, and LOG_ENDLINE.
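The lookup contract described above (a 0-based index on success, -1 when the column is absent) can be sketched as a linear scan over the parsed heading names. This is a standalone illustration, not the library's code: sketchGetFieldIndex is a hypothetical free function that takes the heading array and count as parameters, and exact, case-sensitive matching is an assumption.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a heading lookup; not the FARGOS implementation.
// ASSUMPTION: headings are compared exactly (case-sensitive, no trimming).
int sketchGetFieldIndex(const char *const fields[], int fieldTotal,
                        const char *fieldHeading)
{
    for (int i = 0; i < fieldTotal; ++i) {
        if (std::strcmp(fields[i], fieldHeading) == 0) {
            return i;       // relative subscript, starting at 0
        }
    }
    return -1;              // column was not found
}
```

Because the scan compares strings, its cost grows with the number of columns, which is why performing it once per file rather than once per record is attractive.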

◆ parsedHeadingLine()

virtual int CSV_File::parsedHeadingLine ( )
inline virtual

User-exit invoked when initial header line is parsed.

Usually, this is used to make several calls to getFieldIndex() to obtain the corresponding subscripts of the desired columns, thus performing the expensive work only once per file.

Return values
0  for now, always return 0. The value is currently ignored.

Referenced by processHeaderLine().
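The usage pattern described above, resolving column names to subscripts once when the header arrives, can be sketched with a small mock. MiniCsv, feedHeader, and TradeReader are hypothetical names invented for this illustration; MiniCsv only imitates the two relevant members and is not the CSV_File class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature of a reader with a header user-exit.
class MiniCsv {
public:
    virtual ~MiniCsv() = default;

    // Linear lookup of a heading name; returns -1 when not found.
    int getFieldIndex(const char *name) const {
        for (std::size_t i = 0; i < headings.size(); ++i) {
            if (headings[i] == name) {
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Stand-in for header processing: record the headings, then invoke the user-exit.
    void feedHeader(const std::vector<std::string> &names) {
        headings = names;
        parsedHeadingLine();
    }

    // User-exit; the default does nothing.
    virtual int parsedHeadingLine() { return 0; }

private:
    std::vector<std::string> headings;
};

// A subclass caches the subscripts it needs, once per file.
class TradeReader : public MiniCsv {
public:
    int priceCol = -1;
    int qtyCol = -1;

    int parsedHeadingLine() override {
        priceCol = getFieldIndex("price");
        qtyCol = getFieldIndex("qty");
        return 0;   // the documented convention: always return 0
    }
};
```

Per-record code in such a subclass can then index the parsed fields directly with priceCol and qtyCol instead of repeating the name lookup.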

◆ parseIntoFields()

int CSV_File::parseIntoFields ( int  maxFields,
char *  fieldStart[],
unsigned char *  line,
size_t  lineLen 
)

Parse text line into fields.

Fields are separated by any character specified in the separator list, unless inside a quoted sequence. Quoted sequences are started by any character in the quoted character list.

Parameters
maxFields  indicates the number of elements available in fieldStart.
fieldStart  is an array of character pointers into which will be stored a pointer to each field's value.
line  points to the line to be parsed. It is expected that it will end in a newline character. NOTE: the contents of line will be modified.
lineLen  is the length of the line; the last character (which would be line[lineLen]) should be a newline.
Returns
The number of fields parsed is returned. Only pointers to the first maxFields are stored into the fieldStart array, but the total number of fields seen is returned.

References AS_HEXADECIMAL_BUFFER, LOG_CERR, LOG_ENDLINE, OME_EXPECT_FALSE, OME_EXPECT_TRUE, quoteCharList, quoteListLen, separatorCharList, and separatorListLen.

Referenced by processHeaderLine().
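As an illustration of the splitting rules described above, the following standalone sketch tokenizes a line in place. It is not the FARGOS implementation: the name sketchParseIntoFields and the explicit separators and quotes parameters are hypothetical (the class keeps these in separatorCharList and quoteCharList), and the treatment of quotes (quote characters are dropped from the field value, and a quoted run ends at the same character that opened it) is an assumption. It does follow the documented contract that the line is modified and that the total field count is returned even when it exceeds maxFields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// True if c is one of the characters in the NUL-terminated set.
static bool inSet(char c, const char *set)
{
    return c != '\0' && std::strchr(set, c) != nullptr;
}

// Hypothetical sketch: split `line` in place into fields.
// ASSUMPTION: line[lineLen] is the writable trailing newline; it is
// overwritten with '\0' so the last field is a proper C string.
int sketchParseIntoFields(int maxFields, char *fieldStart[], char *line,
                          std::size_t lineLen, const char *separators,
                          const char *quotes)
{
    line[lineLen] = '\0';
    int total = 0;
    std::size_t pos = 0;
    for (;;) {
        char *start = line + pos;
        if (pos < lineLen && inSet(line[pos], quotes)) {
            const char quote = line[pos];
            ++pos;
            start = line + pos;                 // field begins after the opening quote
            while (pos < lineLen && line[pos] != quote) {
                ++pos;                          // separators inside quotes are kept
            }
            if (pos < lineLen) {
                line[pos] = '\0';               // drop the closing quote
                ++pos;
            }
        }
        while (pos < lineLen && !inSet(line[pos], separators)) {
            ++pos;                              // advance to the next separator
        }
        if (total < maxFields) {
            fieldStart[total] = start;          // store only the first maxFields pointers
        }
        ++total;                                // but count every field seen
        if (pos >= lineLen) {
            break;
        }
        line[pos] = '\0';                       // terminate this field at the separator
        ++pos;
    }
    return total;
}
```

A caller can compare the return value with maxFields to detect that a line held more fields than the array could record.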

◆ processHeaderLine()

int CSV_File::processHeaderLine ( unsigned char *  line,
size_t  lineLen 
)
virtual

Process initial header line in file.

Normally called by readAndProcessTextLines().

Reimplemented from Read_And_Process_File.

References fields, fieldTotal, headerLine, LOG_CERR, LOG_ENDLINE, MAX_FIELDS, parsedHeadingLine(), and parseIntoFields().

◆ processPacket()

int CSV_File::processPacket ( unsigned char *  data,
size_t  len 
)
inline

◆ readAndProcessCSV_File()

int CSV_File::readAndProcessCSV_File ( )
inline

Top-level routine to parse CSV file.

A trivial cover routine to invoke readAndProcessTextLines() with the appropriate heading mode.

Returns
The value from readAndProcessTextLines() is returned.

References doHeaderLine, and Read_And_Process_File::readAndProcessTextLines().

Member Data Documentation

◆ doHeaderLine

bool CSV_File::doHeaderLine
protected

◆ fields

char* CSV_File::fields[MAX_FIELDS]
protected

Referenced by getFieldIndex(), and processHeaderLine().

◆ fieldTotal

int CSV_File::fieldTotal
protected

◆ headerLine

unsigned char* CSV_File::headerLine
protected

◆ quoteCharList

unsigned char CSV_File::quoteCharList[8]
protected

Referenced by CSV_File(), and parseIntoFields().

◆ quoteListLen

unsigned char CSV_File::quoteListLen
protected

Referenced by CSV_File(), and parseIntoFields().

◆ readsPerformed

uint32_t CSV_File::readsPerformed
protected

Referenced by CSV_File(), and processPacket().

◆ separatorCharList

unsigned char CSV_File::separatorCharList[8]
protected

Referenced by CSV_File(), and parseIntoFields().

◆ separatorListLen

unsigned char CSV_File::separatorListLen
protected

Referenced by CSV_File(), and parseIntoFields().


The documentation for this class was generated from the following files:
Generated: Tue Jul 28 2020 16:03:27