String Manipulation Functions


Detailed Description

This file defines various utilities for manipulating old-style C character strings in C++, allocated and freed with the new[] and delete[] operators. The C++ Standard Template Library (STL) does not handle UTF-8 and Unicode, and these functions provide convenient low-level string manipulation that is easily adapted to UTF-8, without the large overhead of the STL.

Todo:
*** Convert to using memcpy and memmove instead of loops like s[c]=s[c+1].

*** Some functions that take both a dest and a src must be careful not to delete[] dest if src==dest. Go through and check for that.

Make a strprintf(&str,"format",args). It would have to step through format, figure out how big a string is needed, then allocate it for str.

Perhaps wrap this in a LaxStr namespace?


Functions

int isblank (const char *str)
char * itoa (int a, char *dest, int base)
 This turns an integer into a string with any base.
char * numtostr (int num, int par)
 Basic conversion of integer to string, created with new.
int squish (char *exprs, int p1, int p2)
 Removes characters p1 to p2 inclusive: [p1,p2].
char * numtostr (char *dest, int buflen, double num, int par)
 Converts a double into a char string, removing trailing zeroes.
char * numtostr (double num, int par)
 Converts a double into a char string, removing trailing zeroes.
char * newnstr (const char *str, int n)
 Return a new'd, null terminated duplicate of the first n characters of str.
char * newstr (const char *str)
 Return a new'd duplicate of str.
char * makestr (char *&dest, const char *src)
 Make dest a new copy of src.
char * makenstr (char *&dest, const char *src, unsigned int n)
 Like makestr, but only grabs the first n characters of src.
char * appendnstr (char *&dest, const char *src, int n)
 Append the first n characters of src to dest.
char * prependnstr (char *&dest, const char *src, int n)
 Prepend the first n characters of src to dest.
char * appendline (char *&dest, const char *src)
 Append src to dest, with an extra '\n' between if dest!=NULL to start.
char * appendstr (char *&dest, const char *src)
 Append src to dest.
char * prependstr (char *&dest, const char *src)
 Prepend src to dest (returning srcdest).
char * extendstr (char *&dest, int n)
 Expand how much memory dest takes up, and leave its contents the same.
char * extendstr (char *&dest, int &curmax, int n)
 Expand how much memory dest takes up, and leave its contents the same.
char * stripws (char *dest, char where)
 Strip whitespace. where&1 means in front, where&2 means trailing.
char * insertstr (char *&dest, const char *data, int atpos)
 Insert data into dest.
char * replace (char *&dest, const char *data, int s, int e, int *newe)
 Replace the characters from s up to and including e with data.
char * replaceall (const char *dest, const char *old, const char *newn, int s, int e)
 Replace all occurrences of old in dest with newn. Does not modify dest. Returns new'd result.
char * replaceallname (const char *dest, const char *old, const char *newn)
 Replace all name occurrences in dest of old with newn.
char * getnamestring (const char *buf)
 Get a new char array of any '_' or alphanumeric characters. Assumes no whitespace.
void deletestrs (char **&strs, int n)
 For char ** arrays, delete each element, then strs itself.
char ** split (const char *str, char delim, int *n_ret)
 Split str using delim as delimiter.
char ** spliton (char *str, char delim, int *n_ret)
 Split str using delim as delimiter, modifying the original str.
char ** splitspace (const char *stro, int *n_ret)
 Split stro into a NULL terminated char** of subfields, where whitespace is the delimiter.
char ** splitonspace (char *str, int *n_ret)
 Split str into a NULL terminated char** of subfields on whitespace, modifying str.
const char * lax_basename (const char *path)
 Return a pointer to the part of path that starts the file name, or NULL.
char * lax_dirname (const char *path, char appendslash)
 Returns a new char[] with the dir part of the path, or NULL.
char * increment_file (const char *file)
 Return a new name based on the old file plus one, so "file.jpg" will return "file2.jpg".


Function Documentation

char* appendline (char *&dest, const char *src)

Append src to dest, with an extra '\n' between if dest!=NULL to start.

If dest is not NULL, delete[] dest is called. The new'd string is assigned to dest, which is returned also.

This is useful, for instance, if you want to concatenate many error messages, and don't want to worry about trailing newlines.
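
The behavior described above can be sketched as follows. This is only an illustration under stated assumptions, not the Laxkit source: sketch_appendline is a hypothetical name, and treating a NULL src as a no-op is an assumption. The new string is built before the old one is deleted, so the sketch also stays safe when src points into dest.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for appendline(); not the Laxkit implementation.
// Appends src to dest, inserting '\n' first when dest was already non-NULL.
char *sketch_appendline(char *&dest, const char *src)
{
    if (!src) return dest;                          // assumption: NULL src is a no-op
    std::size_t dlen = dest ? std::strlen(dest) : 0;
    std::size_t slen = std::strlen(src);
    std::size_t sep  = dest ? 1 : 0;                // room for the '\n' separator
    char *result = new char[dlen + sep + slen + 1];
    if (dest) {
        std::memcpy(result, dest, dlen);
        result[dlen] = '\n';
    }
    std::memcpy(result + dlen + sep, src, slen + 1); // copies the final '\0' too
    delete[] dest;                                   // delete[] on NULL is harmless
    dest = result;
    return dest;
}
```

Starting from a NULL dest, two calls with "first error" and "second error" leave dest holding the two messages separated by a single newline.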

char* appendnstr (char *&dest, const char *src, int n)

Append the first n characters of src to dest.

If dest is not NULL, delete[] dest is called. The new'd string is assigned to dest, which is returned also.

If n<=0 nothing is done. If n>strlen(src) then strlen(src) is used instead.

Todo:
*** question: is *&dest allowed to be NULL?

char* appendstr (char *&dest, const char *src)

Append src to dest.

If dest is not NULL, delete[] dest is called. The new'd string is assigned to dest, which is returned also.

void deletestrs (char **&strs, int n)

For char ** arrays, delete each element, then strs itself.

If n==0, then delete entries until the first NULL entry. Otherwise, delete any non-null entry from 0 to n-1. strs is set to NULL.
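
A minimal sketch of these two cleanup modes, assuming every element was allocated with new[]. The name sketch_deletestrs is hypothetical and the code is not taken from Laxkit.

```cpp
#include <cassert>

// Hypothetical stand-in for deletestrs(), following the rules described above.
void sketch_deletestrs(char **&strs, int n)
{
    assert(n >= 0);
    if (!strs) return;
    if (n == 0) {
        // No count given: the array is assumed to be NULL terminated.
        for (int i = 0; strs[i]; i++) delete[] strs[i];
    } else {
        // Count given: visit slots 0..n-1 (delete[] ignores NULL entries).
        for (int i = 0; i < n; i++) delete[] strs[i];
    }
    delete[] strs;
    strs = nullptr;   // the caller's pointer is reset
}
```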

char* extendstr (char *&dest, int &curmax, int n)

Expand how much memory dest takes up, and leave its contents the same.

Reassigns dest to a new char[] that takes up curmax+n bytes. Updates curmax to the new size.

Returns:
Returns the new string.
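
As an illustration of the curmax bookkeeping, here is a sketch under stated assumptions. sketch_extendstr is a hypothetical name, and zero-filling the new buffer is an assumption made for safety, not something the text above promises.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for extendstr(dest, curmax, n).
// Grows the buffer by n bytes and keeps the existing contents.
char *sketch_extendstr(char *&dest, int &curmax, int n)
{
    assert(n >= 0 && curmax >= 0);
    char *bigger = new char[curmax + n]();        // zero-filled (an assumption)
    if (dest) std::memcpy(bigger, dest, curmax);  // keep the old contents
    delete[] dest;
    dest = bigger;
    curmax += n;                                  // report the new capacity
    return dest;
}
```

A caller that tracks capacity itself can then keep writing into dest without calling strlen on every growth step.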

char* extendstr (char *&dest, int n)

Expand how much memory dest takes up, and leave its contents the same.

Reassigns dest to a new char[] that takes up strlen(dest)+n bytes.

Returns:
Returns the new string.

char* getnamestring (const char *buf)

Get a new char array of any '_' or alphanumeric characters. Assumes no whitespace.

Returns:
Returns the new'd array, which the user must delete.

char* increment_file (const char *file)

Return a new name based on the old file plus one, so "file.jpg" will return "file2.jpg".

"file3.jpg"->"file4.jpg", "blah"->"blah2"->"blah3", "blah001.jpg"->"blah2.jpg"

Note that currently, only the final extension is saved, meaning "blah.tar.gz" -> "blah.tar2.gz".

Todo:
should have "blah001.jpg"->"blah002.jpg", and perhaps optionally allow "blah.tar.gz" -> "blah2.tar.gz"
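
The documented mappings can be reproduced with a short sketch. It is a re-implementation written for illustration, not the Laxkit code: sketch_increment_file is a hypothetical name, and the sketch assumes file is a bare file name with no directory part and a digit run small enough to fit in a long.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical re-implementation of the documented behavior; returns a new char[].
char *sketch_increment_file(const char *file)
{
    assert(file);
    const char *ext = std::strrchr(file, '.');        // only the final extension counts
    std::size_t baselen = ext ? (std::size_t)(ext - file) : std::strlen(file);
    if (!ext) ext = file + baselen;                   // points at "" when there is no extension

    // Walk back over any trailing digits of the base name.
    std::size_t numstart = baselen;
    while (numstart > 0 && std::isdigit((unsigned char)file[numstart - 1])) numstart--;

    // No digits counts as 1, so "file" becomes "file2".
    long num = 1;
    if (numstart < baselen) num = std::strtol(file + numstart, nullptr, 10);

    std::size_t total = numstart + 24 + std::strlen(ext) + 1;  // 24 covers any long
    char *result = new char[total];
    std::memcpy(result, file, numstart);
    std::snprintf(result + numstart, total - numstart, "%ld%s", num + 1, ext);
    return result;
}
```

Because the number is parsed and reprinted, leading zeroes are dropped, which matches the "blah001.jpg" to "blah2.jpg" behavior noted above.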

char* insertstr (char *&dest, const char *data, int atpos)

Insert data into dest.

dest will be reassigned to a new char[]. If atpos==0, then this function is the same as prependstr(dest,data). If atpos<0 or atpos>=strlen(dest) then it is the same as appendstr(dest,data). Otherwise, for instance, insertstr(dest,data,3) will insert data starting at the 4th byte of dest.

If dest==NULL and data!=NULL, then make dest be a copy of data. If data is NULL, then return dest unchanged.
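
The position rules above can be sketched like this. sketch_insertstr is a hypothetical name and the code is an illustration of the described semantics, not the Laxkit implementation.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for insertstr(), following the rules described above.
char *sketch_insertstr(char *&dest, const char *data, int atpos)
{
    if (!data) return dest;                          // NULL data: dest is unchanged
    std::size_t dlen = dest ? std::strlen(dest) : 0;
    std::size_t ilen = std::strlen(data);
    // Negative or too-large positions mean "append".
    std::size_t pos = (atpos < 0 || (std::size_t)atpos >= dlen) ? dlen : (std::size_t)atpos;

    char *result = new char[dlen + ilen + 1];
    if (dest) std::memcpy(result, dest, pos);        // bytes before the insertion point
    std::memcpy(result + pos, data, ilen);           // the inserted data
    if (dest) std::memcpy(result + pos + ilen, dest + pos, dlen - pos); // the rest
    result[dlen + ilen] = '\0';

    delete[] dest;
    dest = result;
    return dest;
}
```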

int isblank (const char *str)

Returns 1 if str is NULL, 2 if str is "", 3 if str consists only of whitespace, and 0 otherwise. Checks up to n chars. If n<=0 then strlen(str) is used.

char* itoa (int a, char *dest, int base)

This turns an integer into a string with any base.

base can be any number greater than 1 and less than 37. For bases greater than 10, lowercase letters are used. Does not null terminate, and assumes that dest is big enough to receive the digits.

Returns:
Returns a pointer to the character after the final character written into dest.
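
Since the function does not null terminate, the returned pointer is where the caller would write '\0'. A minimal sketch of that contract follows. sketch_itoa is a hypothetical name, and writing a leading '-' for negative values is an assumption, since the text above does not say how negatives are handled.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for itoa(): writes digits only, no '\0'.
char *sketch_itoa(int a, char *dest, int base)
{
    assert(dest && base > 1 && base < 37);
    const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    unsigned int b = (unsigned int)base;
    // Work on the magnitude as unsigned so the most negative int does not overflow.
    unsigned int mag = a < 0 ? 0u - (unsigned int)a : (unsigned int)a;
    if (a < 0) *dest++ = '-';          // assumption: negatives get a leading '-'

    char tmp[40];                      // digits come out least significant first
    int n = 0;
    do { tmp[n++] = digits[mag % b]; mag /= b; } while (mag);
    while (n > 0) *dest++ = tmp[--n];  // reverse into dest
    return dest;                       // one past the last character written
}
```

A typical call pattern is to terminate the string through the returned pointer, for example `*sketch_itoa(255, buf, 16) = '\0';`.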

const char* lax_basename (const char *path)

Return a pointer to the part of path that starts the file name, or NULL.

"1/2/" returns NULL. "" returns NULL. "1/2" returns "2". "1" returns "1".

Thus if n is the returned pointer, the dirname is just the first n-path characters of path.

Note that this differs from similar basename functions: given "1/", those usually return a pointer to the '\0' after the "/", but lax_basename returns NULL.
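
The four cases listed above follow from a single strrchr call, as this sketch shows. sketch_basename is a hypothetical name, returning NULL for a NULL path is an assumption, and the code is not the Laxkit source.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for lax_basename().
const char *sketch_basename(const char *path)
{
    if (!path) return nullptr;                    // assumption: NULL in, NULL out
    const char *slash = std::strrchr(path, '/');  // last '/' in path, if any
    const char *name = slash ? slash + 1 : path;
    return *name ? name : nullptr;                // nothing after the last '/': NULL
}
```

The result points into path itself, so it must not be deleted, and subtracting path from it gives the length of the directory part.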

char* lax_dirname (const char *path, char appendslash)

Returns a new char[] with the dir part of the path, or NULL.

The calling code must delete[] what is returned.

For example, "blah.h" will return NULL. "yack/hack" will return "yack", or "yack/" if appendslash is nonzero.

"/" will return "" (not NULL), or "/" if appendslash is nonzero.

char* makenstr (char *&dest, const char *src, unsigned int n)

Like makestr, but only grabs the first n characters of src.

If src==NULL and n>0, then make dest point to a new char[n], with dest[0]=0.

Returns:
Returns a pointer to what dest now points to.

char* makestr (char *&dest, const char *src)

Make dest a new copy of src.

If dest is not NULL, dest is deleted first. If src is NULL, then dest is also made NULL.

Returns:
Returns a pointer to what dest now points to.

char* newnstr (const char *str, int n)

Return a new'd, null terminated duplicate of the first n characters of str.

If str==NULL and n>0, then return a new char[n] with the first byte 0.

If n==0, then return an empty string (""), not NULL.

char* newstr (const char *str)

Return a new'd duplicate of str.

If str==NULL, then return NULL.

char* numtostr (double num, int par)

Converts a double into a char string, removing trailing zeroes.

If par!=0, parentheses are put around the number. A precision of 13 decimal places is used. The resulting null terminated string is put into a new char[].

char* numtostr (char *dest, int buflen, double num, int par)

Converts a double into a char string, removing trailing zeroes.

If par!=0, parentheses are put around the number. A precision of 13 decimal places is used. The resulting string is meant to be put into the supplied dest, not exceeding buflen characters (including null termination), and no new string is created.

Todo:
***** The current code ignores buflen. What about printf("%g")?

char* numtostr (int num, int par)

Basic conversion of integer to string, created with new.

If par!=0, parentheses are put around the number.

Returns:
Returns the new'd char array.

char* prependnstr (char *&dest, const char *src, int n)

Prepend the first n characters of src to dest.

If dest is not NULL, delete[] dest is called. The new'd string is assigned to dest, which is returned also.

If n<=0 nothing is done. If n>strlen(src) then strlen(src) is used instead.

char* prependstr (char *&dest, const char *src)

Prepend src to dest (returning srcdest).

If dest is not NULL, delete[] dest is called. The new'd string is assigned to dest, which is returned also.

char* replace (char *&dest, const char *data, int s, int e, int *newe)

Replace the characters from s up to and including e with data.

dest is reassigned to a new char[]. If newe is not NULL, then put the new e into it.

If dest==NULL and data!=NULL, then make dest be a copy of data. If data is NULL, then return dest unchanged. These happen no matter what s and e are.

If s or e are out of bounds or e<s, then NULL is returned and dest is not changed.

Returns dest.
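
The rules above can be sketched as follows. sketch_replace is a hypothetical name and this is not the Laxkit code. In particular, the text does not define "the new e", so the sketch assumes it means the index of the last character of the inserted data.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for replace(); [s,e] is an inclusive range.
char *sketch_replace(char *&dest, const char *data, int s, int e, int *newe)
{
    if (!data) return dest;                          // NULL data: dest is unchanged
    std::size_t ilen = std::strlen(data);
    if (!dest) {                                     // NULL dest: becomes a copy of data
        dest = new char[ilen + 1];
        std::memcpy(dest, data, ilen + 1);
        return dest;
    }
    int dlen = (int)std::strlen(dest);
    if (s < 0 || e >= dlen || e < s) return nullptr; // bad range: dest is untouched

    std::size_t head = (std::size_t)s;               // bytes kept before the range
    std::size_t tail = (std::size_t)(dlen - e - 1);  // bytes kept after the range
    char *result = new char[head + ilen + tail + 1];
    std::memcpy(result, dest, head);
    std::memcpy(result + head, data, ilen);
    std::memcpy(result + head + ilen, dest + e + 1, tail + 1); // includes the '\0'

    delete[] dest;
    dest = result;
    if (newe) *newe = s + (int)ilen - 1;  // assumption: index of the last inserted char
    return dest;
}
```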

char* replaceall (const char *dest, const char *old, const char *newn, int s, int e)

Replace all occurrences of old in dest with newn. Does not modify dest. Returns new'd result.

If s and e are specified then replace only within the inclusive range [s,e].

This function is not efficient for large arrays (or small ones, for that matter): it reallocates on finding each occurrence. *** It should probably be rewritten to reallocate once, which means counting the occurrences of old, then substituting.

Todo:
**** this needs work and testing
*** maybe have option to replace dest rather than return new?

char* replaceallname (const char *dest, const char *old, const char *newn)

Replace all name occurrences in dest of old with newn.

Replace every old name in dest with newn, without deleting dest. A name in this case means a continuous string of '_' or alphanumeric characters. Really it just searches for old in dest, then checks that the characters immediately before and after the match are not '_', a letter, or a number. Then it puts newn where that old was. This is like the vi command s/\<old\>/newn/g.

Returns:
Returns the new char string.
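
The boundary check described above can be sketched with a count-then-copy pass, so the result is allocated only once. The names sketch_replaceallname, is_name_char and name_at are hypothetical, and the code illustrates the description without being the Laxkit implementation.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// True for characters that may appear in a name: '_' or alphanumeric.
static bool is_name_char(char c) { return c == '_' || std::isalnum((unsigned char)c); }

// True if old occurs at p as a whole name, i.e. not glued to other name characters.
static bool name_at(const char *start, const char *p, const char *old, std::size_t oldlen)
{
    return std::strncmp(p, old, oldlen) == 0
        && (p == start || !is_name_char(p[-1]))
        && !is_name_char(p[oldlen]);
}

// Hypothetical stand-in for replaceallname(); returns a new char[].
char *sketch_replaceallname(const char *dest, const char *old, const char *newn)
{
    assert(dest && old && newn && *old);
    std::size_t oldlen = std::strlen(old), newlen = std::strlen(newn);

    // Pass 1: measure the result so it can be allocated once.
    std::size_t total = 0;
    for (const char *p = dest; *p; ) {
        if (name_at(dest, p, old, oldlen)) { total += newlen; p += oldlen; }
        else { total++; p++; }
    }

    // Pass 2: copy, substituting newn at each whole-name match.
    char *result = new char[total + 1];
    char *w = result;
    for (const char *p = dest; *p; ) {
        if (name_at(dest, p, old, oldlen)) {
            std::memcpy(w, newn, newlen);
            w += newlen;
            p += oldlen;
        } else {
            *w++ = *p++;
        }
    }
    *w = '\0';
    return result;
}
```

With old set to "x", the input "x = x1 + max_x + x;" changes only the first and last x, because x1 and max_x are longer names.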

char** split (const char *str, char delim, int *n_ret)

Split str using delim as delimiter.

The delimiter is removed. The number of fields is put into n_ret.

Returns a NULL terminated char** holding the fields, which are new'd character arrays copied from the original str. The user must delete these, or call deletestrs(). Empty fields are created as the string "".

Todo:
investigate strtok strsep, make a split version that puts '\0' on delimiters, does not allocate new strings, returns new'd array of char* that point to start of each field..
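
A sketch of the copying variant described for split() follows. sketch_split is a hypothetical name, the code is not from Laxkit, and treating an empty str as a single empty field is an assumption.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for split(): every field is a separate new char[].
char **sketch_split(const char *str, char delim, int *n_ret)
{
    assert(str);
    int n = 1;                                   // one more field than delimiters
    for (const char *p = str; *p; p++) if (*p == delim) n++;

    char **fields = new char*[n + 1];
    const char *start = str;
    for (int i = 0; i < n; i++) {
        const char *end = std::strchr(start, delim);
        if (!end) end = start + std::strlen(start);  // last field runs to the '\0'
        std::size_t len = (std::size_t)(end - start);
        fields[i] = new char[len + 1];               // empty fields become ""
        std::memcpy(fields[i], start, len);
        fields[i][len] = '\0';
        start = *end ? end + 1 : end;                // step over the delimiter
    }
    fields[n] = nullptr;                             // NULL terminator
    if (n_ret) *n_ret = n;
    return fields;
}
```

Each field and the array itself must later be released with delete[], which is the job the text assigns to deletestrs().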

char** spliton (char *str, char delim, int *n_ret)

Split str using delim as delimiter, modifying the original str.

The delimiter is replaced by '\0' in the original str. The number of fields is put into n_ret.

Returns a NULL terminated char** holding the fields which point to the start of each field within str. The user need only delete[] the returned array, not the individual elements.

char** splitonspace (char *str, int *n_ret)

Split str into a NULL terminated char** of subfields on whitespace, modifying str.

The number of fields created, not including the final NULL, is put into n_ret.

If str is only whitespace, then NULL is returned.

Any length of whitespace is the delimiter. Ignores initial and final whitespace.

Remember that you should not delete a sub-part of the returned array. That is, say you do 'char **t=splitonspace(str,&n)', then you can do 'delete[] t', but do NOT do 'delete[] t[2]', because t[2] points to the inside of the original str.
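
The ownership rule is easier to see in a sketch of in-place splitting. sketch_splitonspace is a hypothetical name, the code is not the Laxkit source, and writing the field count to n_ret is an assumption based on the signature.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Hypothetical stand-in for splitonspace(): fields point into str itself.
char **sketch_splitonspace(char *str, int *n_ret)
{
    assert(str);
    // Pass 1: count fields, i.e. runs of non-whitespace.
    int n = 0;
    for (char *p = str; *p; ) {
        while (std::isspace((unsigned char)*p)) p++;
        if (!*p) break;
        n++;
        while (*p && !std::isspace((unsigned char)*p)) p++;
    }
    if (n_ret) *n_ret = n;
    if (n == 0) return nullptr;                 // only whitespace: NULL

    // Pass 2: record each field start and terminate the field in place.
    char **fields = new char*[n + 1];
    char *p = str;
    for (int i = 0; i < n; i++) {
        while (std::isspace((unsigned char)*p)) p++;
        fields[i] = p;
        while (*p && !std::isspace((unsigned char)*p)) p++;
        if (*p) *p++ = '\0';                    // overwrite one whitespace byte
    }
    fields[n] = nullptr;
    return fields;
}
```

Only the pointer array is allocated here, which is why a single delete[] on the returned array is the whole cleanup.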

char** splitspace (const char *stro, int *n_ret)

Split stro into a NULL terminated char** of subfields, where whitespace is the delimiter.

This does not modify stro. Creates new char[] to hold the fields. The number of fields created, not including the final NULL, is put into n_ret. These strs can be easily deleted by calling deletestrs().

See splitonspace() for a splitter that modifies the string.

Any length of whitespace is the delimiter. Ignores initial and final whitespace.

int squish (char *exprs, int p1, int p2)

Removes characters p1 to p2 inclusive: [p1,p2].

squish does not create a new array. It just copies the later characters into the earlier positions.

Returns:
Returns the number of characters removed.
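
An in-place removal of this kind is a natural fit for memmove, which is safe for overlapping ranges. The sketch below is an illustration, not the Laxkit code: sketch_squish is a hypothetical name, and clamping out-of-range positions instead of failing is an assumption.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for squish(): removes exprs[p1..p2] in place.
int sketch_squish(char *exprs, int p1, int p2)
{
    assert(exprs);
    int len = (int)std::strlen(exprs);
    if (p1 < 0) p1 = 0;                   // assumption: clamp rather than fail
    if (p2 >= len) p2 = len - 1;
    if (p2 < p1) return 0;                // nothing to remove
    // Slide the tail down over the gap; len - p2 bytes includes the '\0'.
    std::memmove(exprs + p1, exprs + p2 + 1, (std::size_t)(len - p2));
    return p2 - p1 + 1;
}
```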

char* stripws (char *dest, char where)

Strip whitespace. where&1 means in front, where&2 means trailing.

This does not create a new string. It merely moves characters in the string as appropriate, and repositions the final '\0'.
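
A minimal sketch of the two where bits, assuming whitespace means whatever isspace() accepts. sketch_stripws is a hypothetical name and the code is not taken from Laxkit.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Hypothetical stand-in for stripws(): edits dest in place and returns it.
char *sketch_stripws(char *dest, char where)
{
    if (!dest) return nullptr;
    if (where & 2) {                      // trailing: move the '\0' earlier
        std::size_t len = std::strlen(dest);
        while (len > 0 && std::isspace((unsigned char)dest[len - 1])) len--;
        dest[len] = '\0';
    }
    if (where & 1) {                      // leading: slide the text to the front
        char *first = dest;
        while (std::isspace((unsigned char)*first)) first++;
        if (first != dest) std::memmove(dest, first, std::strlen(first) + 1);
    }
    return dest;
}
```

Passing 3 for where strips both ends in one call.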


Tue Nov 6 08:46:48 2007, Laxkit