Query Functions - TIBCO ActiveSpaces®

TIBCO ActiveSpaces supports the use of aggregate functions , date and time functions, and string functions.

Date and Time Functions

ActiveSpaces provides support for several date and time functions which can be used in queries.

The date and time functions return either a string or a double. For information about running queries which reference a datetime column, see Querying tibDateTime Columns. The following is the list of date and time functions supported.

Date and Time Function Returns

date(timevalue[, modifier[, modifier, …]]) The date as a string in the format YYYY-MM-DD time(timevalue[, modifier[, modifier, …]]) The time as a string in the format HH:MM:SS datetime(timevalue[, modifier[, modifier, …]]) The date and time as a string in the format

YYYY-MM-DD HH:MM:SS

TIBCO ActiveSpaces^® Concepts

Date and Time Function Returns

julianday(timevalue[, modifier[, modifier, …]]) The number of days since noon in Greenwich on November 24, 4714 B.C as a double.

The date and time functions can act on the values of columns which have been defined as one of the following data types: ^string, ^double, ^long, and ^datetime. The following table shows how each of the data types represent date and time:

Data Type Representation of Date and Time

string ISO_8601 strings. For example, "YYYY-MM-DD

HH:MM:SS.SSS"

double Julian day numbers, the number of days since

noon in Greenwich on November 24, 4714 B.C.

according to the proleptic Gregorian calendar

long (default) as Julian day numbers. See type double

in the earlier row.

long Unix Time, the number of seconds since

1970-01-01 00:00:00 UTC

datetime two 64-bit integers; one for the number of

seconds since January 1, 1970 (Unix epoch), and one for the number of nanoseconds after the time that the ^sec component denotes.

By default, a column of type long is treated as a Julian day number. A column of type long can also be used to hold Unix time values (e.g. the number of seconds since 1970-01-01 00:00:00 UTC). When using a date and time function on a column of type long that holds Unix time values, the ‘unixepoch’

modifier must follow the column name in the parameters passed to the function otherwise the column value is interpreted as a Julian day. See Modifiers.

The following are examples of using the date 2016-12-24 00:00:00 as values in the different column types:

● string - "2016-12-24T00:00:00". For details, see timevalue Format.

● double - 2457746.50000

● long - 1482566400 (Unix time)

● datetime - [1482566400, 0]

The date and time functions can be used in the WHERE clause of SELECT statements and in the select list of SELECT statements. For example,

SELECT * FROM mytable WHERE date(dtmcol)='2016-12-24' SELECT date(dtmcol) FROM mytable

timevalue Format

The timevalue parameter represents a date/time in a format understood by the date and time functions.

timevalue can be in any of the following formats:

1. YYYY-MM-DD

2. YYYY-MM-DD HH:MM

When timevalue is the name of a column in a table, the value of the column is substituted for the timevalue parameter. As per ISO-8601, a date and time can be combined using the literal character ^T. The^T can also be omitted by mutual agreement. Both ways of combining dates and times into a single string are supported.

Formats 2 through 10 can optionally be followed by a timezone indicator. The timezone indicator can have the following formats:

● [+-]HH:MM

● Z

Z represents UTC time and is the timezone used to store dates and times in the data grid. If HH:MM is non-zero, it is subtracted from the date and time and is intended to be used to convert to UTC time.

The fractional seconds value SS.SSS can have one or more digits following the decimal point but only the first three digits are considered. Therefore, support for full nanosecond precision of tibDateTime values stored into ActiveSpaces datetime columns must not be expected for queries.

Format 11 is the Julian day number expressed as a long or floating point number. This format can accept any number of digits as required to represent the Julian day number. Format 11 can also be a

long which represents Unix time. By default ^long values are interpreted as Julian days. The ^unixepoch modifier must follow the column name in the parameters passed to the date and time functions for the value of the column to be interpreted as Unix time. See the section Modifiers.

now is converted into the current date and time. However as each node of the data grid executes the query looking for rows to return, the use of ^now must be carefully analyzed to ensure the query returns the results you expect.

The SQL variables CURRENT_DATE, CURRENT_TIME, and CURRENT_TIMESTAMP can be used instead of specifying date('now'), time('now'), and datetime('now'), respectively.

Modifiers

Each date and time function accepts zero or more modifier parameters which can be used to alter the date or time returned by the function. Modifiers are applied from left to right in the order specified.

The following modifiers can be used to add to the date/time:

● NNN day[s]

● NNN hour[s]

● NNN minute[s]

● NNN.NNNN second[s]

TIBCO ActiveSpaces^® Concepts

● NNN month[s]

● NNN year[s]

The following modifiers shift the date or time backward:

● start of month

● start of year

● start of day

The weekday modifier can be used to shift the date forward to the next date when the weekday number is N. Sunday starts at 0, Monday is 1, and so on.

● weekday N

The following modifiers can be used to convert the timevalue immediately preceding it to something else. For each of these modifiers, the results are undefined if the timevalue preceding it is not of the proper type.

● unixepoch

● localtime

● utc

The unixepoch modifier causes the timevalue preceding it to be interpreted as Unix Time (the number of seconds since January 1, 1970). The localtime and utc modifiers can be used to convert the timevalue immediately preceding it from UTC time to localtime or localtime to UTC time respectively.

Function Usage on Index Columns

When using date and time functions on primary key or secondary index columns in a WHERE clause or table iterator filter string, the application of the function to the column, in the left side of an

equation, does not reduce the number of rows being examined for a query. Indexes are calculated based on the actual value of a column and not the value that is the result of conversion by functions.

You can use the date and time functions in comparisons with key or index fields and still get the benefit of the key or index, if you use the function on the right-hand side of the equation. For example if the

dtm column is of type datetime (holds tibDateTime objects) and dtm is defined as a secondary index, the following query only scans those rows with a date less than '2018-06-01 00:00:00':

SELECT * FROM mytable WHERE dtm < datetime('2018-06-01')

WHERE Clause Examples

Using the date() function always returns a date of the form:

● YYYY-MM-DD Where:

YYYY [0000-9999]

MM [01-12]

DD [01-31]

Given a table with the following columns and values for 2016-12-24 00:00:00:

Columns Data Type Values

dtm datetime [1482566400, 0]

Columns Data Type Values

dtmstr string “2016-12-24T00:00:00”

dtmlong long 1482566400

julianday double 2457746.50000

Assuming all columns of a row are loaded with a date or time value that contains the date 2016-01-01, the following are examples of using the Date() function in a WHERE clause:

SELECT * FROM mytable WHERE date(dtm)=’2016-12-24’

SELECT * FROM mytable WHERE date(dtmstr)=’2016-12-24’

SELECT * FROM mytable WHERE date(dtmlong,’unixepoch’)=’2016-12-24’

SELECT * FROM mytable WHERE date(julianday)=’2016-12-24’

Result Column Examples

The date and time functions supported by ActiveSpaces can be used in the list of result columns for a SELECT statement. The date and time functions cannot be used with table iterators.

Given a table with the following columns and values for 2016-12-24 00:00:00:

Columns Data Type Values

dtm datetime [1482566400, 0]

dtmstr string “2016-12-24T00:00:00”

dtmlong long 1482566400

julianday double 2457746.50000

The following are examples of SELECT statements which use the date and time functions in the select list of the SELECT statement:

SELECT key, date(dtm) FROM mytable WHERE key<=10 SELECT key, datetime(dtm) FROM mytable WHERE key<=10 SELECT key, time(dtm) FROM mytable WHERE key<=10 SELECT key, julianday(dtm) FROM mytable WHERE key<=10

SELECT key, datetime(now), date(dtm) FROM mytable WHERE key <=10

tibDateTime

Date and time information can be stored in ActiveSpaces as date and time strings, ^{Julian Day}^long or

double values, or as tibDateTime values. tibDateTime consists of two 64-bit integers; one for the number of seconds since January 1, 1970 (Unix epoch), and one for the number of nanoseconds after the time that the sec component denotes.

ActiveSpaces tables can be defined, using ^tibdg , with columns of type ^datetime which are mapped internally to a tibDateTime object for storing data into rows of the table. The following is the C API for setting and retrieving tibDateTime objects into/from the rows of a table:

void tibdgRow_SetDateTime(tibEx e, tibdgRow row, const char *columnName, const tibDateTime *value)

tibDateTime* tibdgRow_GetDateTime(tibEx e, tibdgRow row, const char* columnName)

tibDateTime columns can be used for primary keys and for secondary indexes.

ActiveSpaces does not support time zone information to go along with any date and time values.

TIBCO recommends that you always store dates and times as UTC time.

TIBCO ActiveSpaces^® Concepts

Populating tibDateTime

Windows and Unix platforms use different structures for retrieving date/time data. The following are examples of how you can get the current date and time on each platform and populate a tibDateTime object:

dt.nsec = timebuffer.tv_usec * 1000;

Windows

dt.nsec = timebuffer.millitm * 1000000;

The ActiveSpaces API can then be used to store the tibDateTime object into a row which is then stored into a table in the data grid.

Querying tibDateTime Columns

To perform a SQL query on a tibDateTime column, the value of the tibDateTime column is internally converted to a string of the form:

YYYY-MM-DD HH:MM:SS.SSSSSSSSSZ

Where the ' SSSSSSSSS ' following ' SS. ' represent nanoseconds and 'Z' represents Coordinated Universal Time (UTC).

Since we do this internal conversion of tibDateTime column values, the SQL date and time functions which normally only work on ISO 8601 formatted strings can also be used on the converted

tibDateTime values. The date and time functions are lenient with respect to the number of digits following the decimal in the fractional seconds, so the nine places of precision in the string representation of a tibDateTime value does not cause any issues.

The following are examples of queries which can be run on a column of type datetime which contains a tibDateTime object:

SELECT * FROM mytable WHERE dtm=’2016-01-01 00:00:00.000000000Z’

SELECT * FROM mytable WHERE date(dtm)=’2016-01-01’

SELECT * FROM mytable WHERE datetime(dtm)=’2016-01-01 00:00:00’

Error Conditions

Using a date or time function on a column which does not contain an appropriate date or time value does not cause an exception. In most cases, an SQL NULL is the result of the function and the query is processed accordingly.

SQL Aggregate Functions

TIBCO ActiveSpaces supports the use of aggregate functions as output columns in queries.

The aggregate functions supported are listed in the following table:

Function Description

AVG(<column>) Computes the arithmetic mean of the non-NULL values in the column. The type of the result is always a Double, regardless of the data type of the underlying column. If the data type of the underlying column is String, the value is converted to a Double. If the data type of the underlying column is Opaque, the data is treated as a String and then converted to a Double.

COUNT(<column>) COUNT(*) Computes the number of non-NULL values in the column, or the number of rows. The result is always of type Long.

COUNT() is not supported.

MIN(<column>) The minimum non-NULL value in the column. The type of the result matches the type of the underlying column.

MAX(<column>) The maximum non-NULL value in the column. The type of the result matches the type of the underlying column.

SUM(<column>) Computes the sum of the non-NULL values in the column.

The type of the result is Long if the data type of the

underlying column is Long, otherwise it is Double. If the data type of the underlying column is String, the value is

converted to a Double. If the data type of the underlying column is Opaque, the data is treated as a String and then converted to a Double.

GROUP_CONCAT( DISTINCT

<column>) Concatenates the unique values for a column in the group into a comma separated list. The DISTINCT keyword is required. The order of the values in the list is not defined.

The type of the result is always a String, irrespective of the data type of the underlying column. If the data type of the underlying column is Long or Double, the value is converted to a String. If the data type of the underlying column is Opaque, the data is treated as String.

Aggregate functions can only be applied to a single column and cannot be applied to expressions or functions. For example, the following functions are not valid:

AVG(col1 + col2) MIN(trim(col1))

Whereas the following function is valid:

AVG(col1)

If all of the underlying values are NULL or cannot be converted to the appropriate data type for the function, the value of the function is NULL.

Using Aggregate Functions with the WHERE Clause

If the query includes a WHERE clause, the aggregate functions are applied to the rows that match the clause. If no WHERE clause is included, the functions are applied to all the rows in the table.

TIBCO ActiveSpaces^® Concepts

Using Aggregate Functions with the GROUP BY Clause

ActiveSpaces supports the GROUP BY clause and results in the aggregate functions being applied to the rows that match unique combinations of the GROUP BY columns. For example, the following query returns a set of rows - one for each state, with each row containing two columns, the state, and the mean salary for employees in that state:

SELECT state, AVG(salary) FROM employees GROUP BY state

If the query includes a GROUP BY clause, all the columns in the GROUP BY clause must be included as output columns. For instance, the following query is not valid because state is not one of the output columns:

SELECT AVG(salary) FROM employees GROUP BY state

The GROUP BY clause can only be applied to a single column and cannot be applied to expressions or functions. The following code snippet is an invalid usage of the GROUP BY clause:

SELECT date(dtm) AS Year,COUNT(dtm) FROM t1 GROUP BY Year

What Happens when the Query Contains Both, the WHERE and the GROUP BY Clause

If the query contains both a WHERE clause and a ^{GROUP BY} clause, the ^WHERE clause must precede the

GROUP BY clause.

As is the case with other TIBCO ActiveSpaces queries, the order of the rows in the result set is not defined unless the ^{ORDER BY} clause is used.

On large data sets, calculating the aggregate results can take a long time because large numbers of rows need to be processed before the computation can be completed. In such cases, when creating or

executing the statement, it is advisable to set the

TIBDG_STATEMENT_PROPERTY_DOUBLE_FETCH_TIMEOUT property accordingly.

The supported date-time functions can be used as output columns in the WHERE clause but not in the GROUP BY clause. Applications that send a large number of queries can choose a proxy binding strategy to distribute the queries across the proxies available to you.

The Usage of Indexes When the Query Contains Aggregation

Indexes can be used for a column that is an argument to a calculated function, or to GROUP BY, if the column has a constraint within the WHERE clause. For example, the following queries use the last_name_idx:

SELECT last_name, COUNT(last_name) AS count_last_name FROM t1 WHERE last_name LIKE 'B%' GROUP BY last_name

SELECT COUNT(last_name) AS count_last_name FROM t1 WHERE last_name LIKE 'B%

'SELECT last_name FROM t1 WHERE last_name LIKE 'C%' GROUP BY last_name

But the same queries without the WHERE constraints perform a full table scan.

Limitations

The following clauses are not supported:

● HAVING

● DISTINCT (Except for GROUP_CONCAT( DISTINCT <column>))

● ALL

The following keywords cannot be passed as an argument to the aggregate functions:

● DISTINCT (Except for GROUP_CONCAT( DISTINCT <column>))

● ALL

For example, the following statements are invalid:

SELECT COUNT(DISTINCT city) FROM t1 SELECT COUNT(ALL city) FROM t1

Floating Point Calculations in Aggregate Functions

By default, floating point calculations are inexact and the order in which they are carried out can subtly affect the results. When executing queries involving aggregate functions, ActiveSpaces processes the data from the different copysets in parallel leading to slight variations in the results of floating point calculations. While the variations might be very small, they can be amplified by rounding the decimal value. For example, the average of a column can be calculated as 80.2849999999999966 one time and 80.2850000000000108 another time. This variation is clearly very small (1.4e-14) but if the results are rounded to 2 decimal places, that is 80.28 and 80.29, the variation appears greater than that.

SQL String Functions

TIBCO ActiveSpaces includes the following SQL String functions:

Function Description

|| (concat operator) Concatenates two strings and returns a single string. The ^|| operator joins together the two strings of its operands.

char(X1,X2,...,XN) Returns a string composed of the characters represented by the Unicode code points specified by X1, X2, and so on.

instr(X,Y) Finds the first occurrence of string Y in string X. If a match is found, the function returns a Long value, which is the starting position of string Y in string X. Otherwise, it returns 0. If either X or Y is NULL, then the result is NULL, which in SQL means no value is returned. Remember that in SQL, the first character is considered position 1 and not 0.

For example:

SELECT instr(FIRST_NAME, 'jo') FROM t1

If the ^FIRST_NAME is 'john', the function returns the starting position of the string "jo " in the string "john". In this case, the function returns 1.

ltrim(X) Returns a string formed after removing space characters, if any, from the left of string X.

ltrim(X.Y) Returns a string formed by removing the characters that appear in Y from the beginning of X. For example,

SELECT ltrim(FIRST_NAME, 'j') FROM t1

In this case the function returns "ohn" since it removes the 'j' character from the left of the string.

rtrim(X) Returns a string formed after removing space characters, if any, from the right of string X.

rtrim(X.Y) Returns a string formed by removing the characters that appear in Y from the end of X.

SELECT rtrim(FIRST_NAME, 'n') FROM t1

In this case the function returns "joh" if the value in the column was

"john".

TIBCO ActiveSpaces^® Concepts

Function Description

trim(X) Returns a string formed after removing space characters, if any, from both sides of string X.

trim(X,Y) Returns a string formed by removing the characters that appear in Y from the beginning and the end of X.

SELECT trim(FIRST_NAME, 'n') FROM t1

In this case the function returns "atha" if the value in the column was

"nathan".

lower(X) Returns a string formed after converting the characters in string X to lower case.

upper(X) Returns a string formed after converting the characters in string X to upper case.

length(X) Returns a long that is the length of the string X. If X is NULL, the function

在文檔中 TIBCO ActiveSpaces® (頁 41-50)