-<!-- $Header: /cvsroot/pgsql/doc/src/sgml/array.sgml,v 1.19 2002/01/20 22:19:55 petere Exp $ -->
+<!-- $Header: /cvsroot/pgsql/doc/src/sgml/array.sgml,v 1.19.2.1 2002/03/17 20:05:58 tgl Exp $ -->
<chapter id="arrays">
<title>Arrays</title>
As shown, an array data type is named by appending square brackets
(<literal>[]</>) to the data type name of the array elements.
The above query will create a table named
- <structname>sal_emp</structname> with a <type>text</type> string
- (<structfield>name</structfield>), a one-dimensional array of type
+ <structname>sal_emp</structname> with columns including
+ a <type>text</type> string (<structfield>name</structfield>),
+ a one-dimensional array of type
<type>integer</type> (<structfield>pay_by_quarter</structfield>),
which represents the employee's salary by quarter, and a
two-dimensional array of <type>text</type>
Now we do some <command>INSERT</command>s. Observe that to write an array
value, we enclose the element values within curly braces and separate them
by commas. If you know C, this is not unlike the syntax for
- initializing structures.
+ initializing structures. (More details appear below.)
<programlisting>
INSERT INTO sal_emp
</programlisting>
The array subscript numbers are written within square brackets.
- <productname>PostgreSQL</productname> uses the
+ By default <productname>PostgreSQL</productname> uses the
<quote>one-based</quote> numbering convention for arrays, that is,
an array of <replaceable>n</> elements starts with <literal>array[1]</literal> and
ends with <literal>array[<replaceable>n</>]</literal>.
schedule
--------------------
- {{"meeting"},{""}}
+ {{meeting},{""}}
(1 row)
</programlisting>
those already present, or by assigning to a slice that is adjacent
to or overlaps the data already present. For example, if an array
value currently has 4 elements, it will have five elements after an
- update that assigns to array[5]. Currently, enlargement in this
- fashion is only allowed for one-dimensional arrays, not
+ update that assigns to <literal>array[5]</>. Currently, enlargement in
+ this fashion is only allowed for one-dimensional arrays, not
multidimensional arrays.
</para>
+ <para>
+ Array slice assignment allows creation of arrays that do not use one-based
+ subscripts. For example, one might assign to <literal>array[-2:7]</> to
+ create an array with subscript values running from -2 to 7.
+ </para>
+
<para>
The syntax for <command>CREATE TABLE</command> allows fixed-length
arrays to be defined:
Actually, the current implementation does not enforce the declared
number of dimensions either. Arrays of a particular element type are
all considered to be of the same type, regardless of size or number
- of dimensions.
+ of dimensions. So, declaring the number of dimensions or sizes in
+ <command>CREATE TABLE</command> is simply documentation; it does not
+ affect run-time behavior.
</para>
<para>
</para>
</note>
+ <formalpara>
+ <title>Array input and output syntax.</title>
+ <para>
+ The external representation of an array value consists of items that
+ are interpreted according to the I/O conversion rules for the array's
+ element type, plus decoration that indicates the array structure.
+ The decoration consists of curly braces (<literal>{</> and <literal>}</>)
+ around the array value plus delimiter characters between adjacent items.
+ The delimiter character is usually a comma (<literal>,</>) but can be
+ something else: it is determined by the <literal>typdelim</> setting
+ for the array's element type. (Among the standard datatypes provided
+ in the <productname>PostgreSQL</productname> distribution, type
+ <literal>box</> uses a semicolon (<literal>;</>) but all the others
+ use comma.) In a multidimensional array, each dimension (row, plane,
+ cube, etc.) gets its own level of curly braces, and delimiters
+ must be written between adjacent curly-braced entities of the same level.
+ You may write whitespace before a left brace, after a right
+ brace, or before any individual item string. Whitespace after an item
+ is not ignored, however: after skipping leading whitespace, everything
+ up to the next right brace or delimiter is taken as the item value.
+ </para>
+ </formalpara>
+
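The whitespace rule in the paragraph above can be illustrated with a small standalone sketch. This is not the backend's `ReadArrayStr`; `split_flat_array` and `MAX_ITEM_LEN` are hypothetical names introduced here, and the sketch assumes a flat (one-dimensional) literal with no quoting, backslashes, or nesting.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_ITEM_LEN 32			/* hypothetical limit, for the sketch only */

/*
 * Hypothetical helper (not the backend's code): split a flat literal
 * such as "{ a , b}" into items.  Quoting, backslashes, and nesting
 * are deliberately not handled.  Returns the item count, or -1 if
 * the literal is malformed or does not fit.
 */
int
split_flat_array(const char *lit, char typdelim,
				 char items[][MAX_ITEM_LEN], int max_items)
{
	int			n = 0;

	assert(lit != NULL && items != NULL);
	while (isspace((unsigned char) *lit))
		lit++;					/* whitespace before the left brace */
	if (*lit++ != '{')
		return -1;
	if (*lit == '}')
		return 0;				/* "{}" is the empty array */
	for (;;)
	{
		const char *start;
		size_t		len;

		while (isspace((unsigned char) *lit))
			lit++;				/* leading whitespace is skipped */
		start = lit;
		while (*lit != '\0' && *lit != typdelim && *lit != '}')
			lit++;				/* trailing whitespace is kept */
		if (*lit == '\0')
			return -1;			/* premature end of string */
		len = (size_t) (lit - start);
		if (n >= max_items || len >= MAX_ITEM_LEN)
			return -1;
		memcpy(items[n], start, len);
		items[n][len] = '\0';
		n++;
		if (*lit++ == '}')
			return n;
	}
}
```

Under these assumptions, splitting `{ a , b}` with a comma delimiter yields two items, `a ` (with its trailing space) and `b`. Passing `;` as the delimiter mimics what the text describes for type `box`.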
<formalpara>
<title>Quoting array elements.</title>
<para>
- As shown above, when writing an array literal value you may write double
+ As shown above, when writing an array value you may write double
quotes around any individual array
element. You <emphasis>must</> do so if the element value would otherwise
confuse the array-value parser. For example, elements containing curly
- braces, commas, double quotes, backslashes, or white space must be
- double-quoted. To put a double quote or backslash in an array element
- value, precede it with a backslash.
+ braces, commas (or whatever the delimiter character is), double quotes,
+ backslashes, or leading white space must be double-quoted. To put a double
+ quote or backslash in an array element value, precede it with a backslash.
+ Alternatively, you can use backslash-escaping to protect all data characters
+ that would otherwise be taken as array syntax or ignorable white space.
</para>
</formalpara>
+ <para>
+ The array output routine will put double quotes around element values
+ if they are empty strings or contain curly braces, delimiter characters,
+ double quotes, backslashes, or white space. Double quotes and backslashes
+ embedded in element values will be backslash-escaped. For numeric
+ datatypes it is safe to assume that double quotes will never appear, but
+ for textual datatypes one should be prepared to cope with either presence
+ or absence of quotes. (This is a change in behavior from pre-7.2
+ <productname>PostgreSQL</productname> releases.)
+ </para>
+
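The output quoting rule described above can be sketched as a standalone C fragment. This is an illustration only, not the actual array output routine; `element_needs_quotes` and `quote_element` are hypothetical names, and the caller is assumed to supply a large enough buffer.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: does this element's text need double quotes
 * on output?  True for empty strings and for strings containing curly
 * braces, the delimiter, double quotes, backslashes, or white space.
 */
bool
element_needs_quotes(const char *s, char typdelim)
{
	assert(s != NULL);
	if (*s == '\0')
		return true;			/* empty string */
	for (; *s != '\0'; s++)
	{
		if (*s == '{' || *s == '}' || *s == typdelim ||
			*s == '"' || *s == '\\' || isspace((unsigned char) *s))
			return true;
	}
	return false;
}

/*
 * Hypothetical helper: write the output form of one element into dst.
 * dst must have room for 2 * strlen(src) + 3 bytes.  Embedded double
 * quotes and backslashes are always backslash-escaped.
 */
void
quote_element(const char *src, char typdelim, char *dst)
{
	bool		quote;

	assert(src != NULL && dst != NULL);
	quote = element_needs_quotes(src, typdelim);
	if (quote)
		*dst++ = '"';
	for (; *src != '\0'; src++)
	{
		if (*src == '"' || *src == '\\')
			*dst++ = '\\';
		*dst++ = *src;
	}
	if (quote)
		*dst++ = '"';
	*dst = '\0';
}
```

With a comma delimiter, this sketch leaves `meeting` unquoted and renders an empty string as `""`, which is consistent with the `{{meeting},{""}}` output shown earlier in the chapter.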
<tip>
<para>
Remember that what you write in an SQL query will first be interpreted
*
*
* IDENTIFICATION
- * $Header: /cvsroot/pgsql/src/backend/utils/adt/arrayfuncs.c,v 1.72 2001/11/29 21:02:41 tgl Exp $
+ * $Header: /cvsroot/pgsql/src/backend/utils/adt/arrayfuncs.c,v 1.72.2.1 2002/03/17 20:05:59 tgl Exp $
*
*-------------------------------------------------------------------------
*/
* Local definitions
* ----------
*/
-#ifndef MIN
-#define MIN(a,b) (((a)<(b)) ? (a) : (b))
-#endif
-#ifndef MAX
-#define MAX(a,b) (((a)>(b)) ? (a) : (b))
-#endif
-
#define ASSGN "="
#define RETURN_NULL(type) do { *isNull = true; return (type) 0; } while (0)
-static int ArrayCount(char *str, int *dim, int typdelim);
+static int ArrayCount(char *str, int *dim, char typdelim);
static Datum *ReadArrayStr(char *arrayStr, int nitems, int ndim, int *dim,
FmgrInfo *inputproc, Oid typelem, int32 typmod,
char typdelim, int typlen, bool typbyval,
*-----------------------------------------------------------------------------
*/
static int
-ArrayCount(char *str, int *dim, int typdelim)
+ArrayCount(char *str, int *dim, char typdelim)
{
int nest_level = 0,
i;
temp[MAXDIM];
bool scanning_string = false;
bool eoArray = false;
- char *q;
+ char *ptr;
for (i = 0; i < MAXDIM; ++i)
temp[i] = dim[i] = 0;
if (strncmp(str, "{}", 2) == 0)
return 0;
- q = str;
- while (eoArray != true)
+ ptr = str;
+ while (!eoArray)
{
- bool done = false;
+ bool itemdone = false;
- while (!done)
+ while (!itemdone)
{
- switch (*q)
+ switch (*ptr)
{
- case '\\':
- /* skip escaped characters (\ and ") inside strings */
- if (scanning_string && *(q + 1))
- q++;
- break;
case '\0':
-
- /*
- * Signal a premature end of the string. DZ -
- * 2-9-1996
- */
+ /* Signal a premature end of the string */
elog(ERROR, "malformed array constant: %s", str);
break;
+ case '\\':
+ /* skip the escaped character */
+ if (*(ptr + 1))
+ ptr++;
+ else
+ elog(ERROR, "malformed array constant: %s", str);
+ break;
case '\"':
scanning_string = !scanning_string;
break;
case '{':
if (!scanning_string)
{
+ if (nest_level >= MAXDIM)
+ elog(ERROR, "array_in: illformed array constant");
temp[nest_level] = 0;
nest_level++;
+ if (ndim < nest_level)
+ ndim = nest_level;
}
break;
case '}':
if (!scanning_string)
{
- if (!ndim)
- ndim = nest_level;
+ if (nest_level == 0)
+ elog(ERROR, "array_in: illformed array constant");
nest_level--;
- if (nest_level)
- temp[nest_level - 1]++;
if (nest_level == 0)
- eoArray = done = true;
+ eoArray = itemdone = true;
+ else
+ {
+ /*
+ * We don't set itemdone here; see comments in
+ * ReadArrayStr
+ */
+ temp[nest_level - 1]++;
+ }
}
break;
default:
- if (!ndim)
- ndim = nest_level;
- if (*q == typdelim && !scanning_string)
- done = true;
+ if (*ptr == typdelim && !scanning_string)
+ itemdone = true;
break;
}
- if (!done)
- q++;
+ if (!itemdone)
+ ptr++;
}
temp[ndim - 1]++;
- q++;
- if (!eoArray)
- while (isspace((unsigned char) *q))
- q++;
+ ptr++;
}
for (i = 0; i < ndim; ++i)
dim[i] = temp[i];
int i,
nest_level = 0;
Datum *values;
- char *p,
- *q,
- *r;
+ char *ptr;
bool scanning_string = false;
+ bool eoArray = false;
int indx[MAXDIM],
prod[MAXDIM];
- bool eoArray = false;
mda_get_prod(ndim, dim, prod);
values = (Datum *) palloc(nitems * sizeof(Datum));
MemSet(values, 0, nitems * sizeof(Datum));
MemSet(indx, 0, sizeof(indx));
- q = p = arrayStr;
/* read array enclosed within {} */
+ ptr = arrayStr;
while (!eoArray)
{
- bool done = false;
+ bool itemdone = false;
int i = -1;
+ char *itemstart;
- while (!done)
+ /* skip leading whitespace */
+ while (isspace((unsigned char) *ptr))
+ ptr++;
+ itemstart = ptr;
+
+ while (!itemdone)
{
- switch (*q)
+ switch (*ptr)
{
+ case '\0':
+ /* Signal a premature end of the string */
+ elog(ERROR, "malformed array constant: %s", arrayStr);
+ break;
case '\\':
+ {
+ char *cptr;
+
/* Crunch the string on top of the backslash. */
- for (r = q; *r != '\0'; r++)
- *r = *(r + 1);
+ for (cptr = ptr; *cptr != '\0'; cptr++)
+ *cptr = *(cptr + 1);
+ if (*ptr == '\0')
+ elog(ERROR, "malformed array constant: %s", arrayStr);
break;
+ }
case '\"':
- if (!scanning_string)
- {
- while (p != q)
- p++;
- p++; /* get p past first doublequote */
- }
- else
- *q = '\0';
+ {
+ char *cptr;
+
scanning_string = !scanning_string;
+ /* Crunch the string on top of the quote. */
+ for (cptr = ptr; *cptr != '\0'; cptr++)
+ *cptr = *(cptr + 1);
+ /* Back up to not miss following character. */
+ ptr--;
break;
+ }
case '{':
if (!scanning_string)
{
- p++;
- nest_level++;
- if (nest_level > ndim)
+ if (nest_level >= ndim)
elog(ERROR, "array_in: illformed array constant");
+ nest_level++;
indx[nest_level - 1] = 0;
- indx[ndim - 1] = 0;
+ /* skip leading whitespace */
+ while (isspace((unsigned char) *(ptr+1)))
+ ptr++;
+ itemstart = ptr+1;
}
break;
case '}':
if (!scanning_string)
{
+ if (nest_level == 0)
+ elog(ERROR, "array_in: illformed array constant");
if (i == -1)
i = ArrayGetOffset0(ndim, indx, prod);
+ indx[nest_level - 1] = 0;
nest_level--;
if (nest_level == 0)
- eoArray = done = true;
+ eoArray = itemdone = true;
else
{
- *q = '\0';
+ /*
+ * tricky coding: terminate item value string at
+ * first '}', but don't process it till we see
+ * a typdelim char or end of array. This handles
+ * case where several '}'s appear successively
+ * in a multidimensional array.
+ */
+ *ptr = '\0';
indx[nest_level - 1]++;
}
}
break;
default:
- if (*q == typdelim && !scanning_string)
+ if (*ptr == typdelim && !scanning_string)
{
if (i == -1)
i = ArrayGetOffset0(ndim, indx, prod);
- done = true;
+ itemdone = true;
indx[ndim - 1]++;
}
break;
}
- if (!done)
- q++;
+ if (!itemdone)
+ ptr++;
}
- *q = '\0';
- if (i >= nitems)
+ *ptr++ = '\0';
+ if (i < 0 || i >= nitems)
elog(ERROR, "array_in: illformed array constant");
values[i] = FunctionCall3(inputproc,
- CStringGetDatum(p),
+ CStringGetDatum(itemstart),
ObjectIdGetDatum(typelem),
Int32GetDatum(typmod));
- p = ++q;
-
- /*
- * if not at the end of the array skip white space
- */
- if (!eoArray)
- while (isspace((unsigned char) *q))
- {
- p++;
- q++;
- }
}
/*
retptr = array_seek(arraydataptr, elmlen, offset);
+ *isNull = false;
return ArrayCast(retptr, elmbyval, elmlen);
}
int i,
ndim,
*dim,
- *lb;
+ *lb,
+ *newlb;
int fixedDim[1],
fixedLb[1];
char *arraydataptr;
newarray->ndim = ndim;
newarray->flags = 0;
memcpy(ARR_DIMS(newarray), span, ndim * sizeof(int));
- memcpy(ARR_LBOUND(newarray), lowerIndx, ndim * sizeof(int));
+ /*
+ * Lower bounds of the new array are set to 1. Formerly (before 7.3)
+ * we copied the given lowerIndx values ... but that seems confusing.
+ */
+ newlb = ARR_LBOUND(newarray);
+ for (i = 0; i < ndim; i++)
+ newlb[i] = 1;
+
array_extract_slice(ndim, dim, lb, arraydataptr, elmlen,
lowerIndx, upperIndx, ARR_DATA_PTR(newarray));
*/
int oldlb = ARR_LBOUND(array)[0];
int oldub = oldlb + ARR_DIMS(array)[0] - 1;
- int slicelb = MAX(oldlb, lowerIndx[0]);
- int sliceub = MIN(oldub, upperIndx[0]);
+ int slicelb = Max(oldlb, lowerIndx[0]);
+ int sliceub = Min(oldub, upperIndx[0]);
char *oldarraydata = ARR_DATA_PTR(array);
lenbefore = array_nelems_size(oldarraydata,