Commits · 082f04b5208748be41acca61e1f89290cc686b3e · Sonia Zorba / vollt

Oct 19, 2020
- [ADQL] Fix the "try-fix" feature with regular identifiers. · 082f04b5
  Grégory Mantelet authored Oct 19, 2020
```
Fixes #121
```
  082f04b5
Jul 02, 2019

Revert "[ADQL,TAP] New parser for ADQL-2.1." · f4ffbf1d

Grégory Mantelet authored Jul 02, 2019

This commit reverts commit 89418d13.

The reverted commit will be applied in another branch (probably 'adql-2.1') as
it is part of the next release of ADQL-Lib.

f4ffbf1d

May 10, 2019

[ADQL,TAP] New parser for ADQL-2.1. · 89418d13

Grégory Mantelet authored May 10, 2019

- Now, `ADQLParserFactory.createParser(...)` should be used to create a parser
- Only the new function `LOWER` is supported for the moment
- Not yet possible to manage the optional features _(next dev to come)_
=> 1st step for ADQL-Lib v2.0

- TAP adapted to use the last stable version of the ADQL language
  (i.e. 2.0 for the moment)
  - but it is not yet possible to set the ADQL version to use in the
    configuration file

89418d13

Mar 13, 2019

[ADQL] Add to the parser a function attempting to quickly fix an ADQL query. · 15cd5944

Grégory Mantelet authored Mar 13, 2019

This new function - ADQLParser.tryQuickFix(...) - fixes the most common issues
with ADQL queries:

- replace Unicode confusable characters with their ASCII/UTF-8 version,
- double-quote SQL reserved words/terms (e.g. `public`, `year`, `date`),
- double-quote ADQL function names used as a column name/alias (e.g. `distance`,
  `min`, `avg`),
- double-quote invalid regular identifiers (e.g. `_RAJ2000`, `2mass`).

The last point is far from being perfect but should work at least for
identifiers starting with a digit or an underscore, or an identifier including
one of the following characters: `?`, `!`, `$`, `@`, `#`, `{`, `}`, `[`, `]`,
`~`, `^` and '`'.

It should also be noted that double-quoting a column/table name will make it
case-sensitive. The query may therefore still fail after the double-quote
operation; the case would then have to be checked by the user.

Finally, there is no attempt to fix column and table names (i.e. case
sensitivity and/or typos) using tables/columns list/metadata. That could be a
possible evolution of this function or an additional feature to implement in the
parser.
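The last bullet above (double-quoting invalid regular identifiers) can be illustrated with a minimal sketch. This is not the code of `ADQLParser.tryQuickFix(...)`: the class `IdentifierQuickFixSketch` and its methods are hypothetical names, and the rule used here (a regular identifier starts with a letter and continues with letters, digits or underscores) is an assumption. Reserved words, function names and Unicode confusable characters are deliberately left out.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch; not part of ADQL-Lib. */
final class IdentifierQuickFixSketch {
    // Assumed rule for a regular identifier: a letter, then letters, digits or underscores.
    private static final Pattern REGULAR_IDENTIFIER =
            Pattern.compile("[A-Za-z][A-Za-z0-9_]*");

    /** Tells whether the given name cannot be written as a regular identifier. */
    static boolean needsQuoting(String name) {
        return !REGULAR_IDENTIFIER.matcher(name).matches();
    }

    /** Wraps the name in double quotes only when needed; inner double quotes are doubled. */
    static String quoteIfNeeded(String name) {
        if (!needsQuoting(name))
            return name;
        return "\"" + name.replace("\"", "\"\"") + "\"";
    }
}
```

With this sketch, `_RAJ2000` and `2mass` come back double-quoted, while a name such as `ra` is returned unchanged. As noted above, the quoted form is case-sensitive, so the result may still not match the actual column name.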

15cd5944

Mar 21, 2018
- [ADQL] Finally, do not test exactly the weirdly encoded character · 4dea14fc
  gmantele authored Mar 21, 2018
```
Follow up to the commits 33a790a4
and 5e0f82de
```
  4dea14fc
- [ADQL] Complete the previous commit 33a790a4 · 5e0f82de
  gmantele authored Mar 21, 2018
  
  5e0f82de
- [ADQL] Fix character encoding in JUnit test for ADQLParser. · 33a790a4
  gmantele authored Mar 21, 2018
  
  33a790a4
Nov 10, 2017

[ADQL] Fix escaping of double quotes in delimited identifiers. · 239c7178

gmantele authored Nov 10, 2017

A delimited identifier is any sequence of characters between a pair of
double quotes. For instance: "123 I am a delimited identifier!".

It is of course possible to have double quotes inside this kind of identifier,
but they have to be doubled in order to not be mistaken with the end of the
identifier. For instance: "Cool ""identifier""".

However, this escape option was not taken into account by the ADQL library,
though the same mechanism was already in place for string constants.
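The escaping rule described above can be sketched as follows. This is only an illustration under stated assumptions: the class `DelimitedIdentifierSketch` and its two methods are hypothetical and are not the library's implementation.

```java
/** Hypothetical sketch; not part of ADQL-Lib. */
final class DelimitedIdentifierSketch {
    /** Builds a delimited identifier: doubles inner double quotes, then adds the delimiters. */
    static String delimit(String rawName) {
        return "\"" + rawName.replace("\"", "\"\"") + "\"";
    }

    /** Reverses delimit(...): strips the delimiters, then collapses doubled quotes. */
    static String undelimit(String delimited) {
        if (delimited.length() < 2 || !delimited.startsWith("\"") || !delimited.endsWith("\""))
            throw new IllegalArgumentException("Not a delimited identifier: " + delimited);
        String inner = delimited.substring(1, delimited.length() - 1);
        return inner.replace("\"\"", "\"");
    }
}
```

Applied to the raw name `Cool "identifier"`, `delimit(...)` yields `"Cool ""identifier"""`, and `undelimit(...)` restores the raw name.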

239c7178

Sep 13, 2017

[ADQL] Also append a HINT message in the ParseException message when a SQL · fe4c3e97

gmantele authored Sep 13, 2017

reserved word is encountered instead of a column/table/schema name/alias.

Contrary to the previous commit, this time a list of SQL reserved words
has been added into the ADQL grammar. In this way, the parser ensures that
no word of this list is used in an ADQL query. The raised error is then enriched
with a HINT message stating that this word is part of SQL, is not supported
by ADQL and must be written between double quotes if used as an identifier.

The list of SQL reserved words comes from the ADQL-2.0 standard, after removal
of all potentially used ADQL words, in order to avoid a conflict with the
already existing tokens in the ADQL grammar.

fe4c3e97

[ADQL] Append a HINT message in the ParseException message when an ADQL · db0dfdad

gmantele authored Sep 13, 2017

reserved word is encountered instead of a column/table/schema name/alias.

No list of ADQL reserved words has been added into the ADQL grammar.

However, the ADQL grammar has been slightly changed in order to provide a more
precise location of the REAL wrong part of the query.

Before this commit, if an ADQL reserved word (e.g. 'point') was encountered
outside of its normal syntax (e.g. 'point' not followed by an opening
parenthesis), the next token was highlighted instead of this one, hence a
confusing error message.

For instance, the following ADQL query:

```sql
SELECT point
FROM aTable
```

returned the following error message:

> Encountered "FROM". Was expecting: "("

Now, it will return the following one:

> Encountered "point". Was expecting one of: "*" <QUANTIFIER> "TOP" [...]
> (HINT: "point" is a reserved ADQL word. To use it as a column/table/schema name/alias, write it between double quotes.)

This error message highlights exactly the source of the problem and even
provides the user with a clear explanation of why the query did not parse and
how it could be solved.

db0dfdad

[ADQL] Allow multiple space characters between ORDER/GROUP and BY keywords. · 993ee846
gmantele authored Sep 13, 2017

993ee846

Sep 11, 2017
- [ADQL] Relax JUnit test on an incorrect character in an ADQL query: · caa7f8be
  gmantele authored Sep 11, 2017
```
with a different local charset, the error message will print differently the
incorrect character.
```
  caa7f8be
Sep 08, 2017

[ADQL] Throwing a ParseException instead of an Error · a382b251

gmantele authored Sep 08, 2017

when an incorrect character that can not be interpreted by
the JavaCC Token Manager is encountered.

Actually, the TokenMgrError thrown by JavaCC is caught by all
ADQLParser.parseQuery(...) functions, wrapped inside a ParseException
which is finally thrown instead of the TokenMgrError. In this way,
ADQL-Lib users just have to care about a single Throwable:
ParseException.
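The wrapping described above can be sketched as follows. The nested classes `LexicalError` and `QueryParseException` are hypothetical stand-ins for JavaCC's `TokenMgrError` and the library's `ParseException`, and `tokenize(...)` is a toy tokenizer invented for this illustration only.

```java
/** Hypothetical sketch; the class names below are placeholders, not ADQL-Lib or JavaCC types. */
final class LexicalErrorWrappingSketch {
    /** Stands in for the Error raised by a generated token manager. */
    static final class LexicalError extends Error {
        LexicalError(String message) { super(message); }
    }

    /** Stands in for the single checked exception exposed to callers. */
    static final class QueryParseException extends Exception {
        QueryParseException(String message, Throwable cause) { super(message, cause); }
    }

    /** Toy tokenizer: rejects any character outside the ASCII range. */
    private static void tokenize(String query) {
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c > 127)
                throw new LexicalError("Incorrect character at column " + (i + 1) + ": '" + c + "'");
        }
    }

    /** Entry point: converts the Error into the checked exception, so callers handle one type only. */
    static void parseQuery(String query) throws QueryParseException {
        try {
            tokenize(query);
        } catch (LexicalError err) {
            throw new QueryParseException(err.getMessage(), err);
        }
    }
}
```

Keeping the original error as the cause of the new exception preserves the lexical details for debugging while callers only need a single `catch` clause.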

Besides, the error message has been slightly modified from:

> Lexical error at line 1, column 10.  Encountered: "\u00e9" (233), after : \"\"

to:

> Incorrect character encountered at l.1, c.10: \"\\u00e9\" ('é'), after : \"\"

Thus, the error is more user-friendly and easier to understand.
Additionally, the incorrect character is displayed, as before, in its Unicode
escape form, but also as the character itself (instead of an integer value that
nobody can really understand).

This commit fixes the GitHub issue #17

a382b251

Apr 03, 2017
- [ADQL] Re-Fix GROUP BY's columns handling: · 7a70c603
  gmantele authored Apr 03, 2017
```
a qualified column name should be allowed, but still no column index should be.
```
  7a70c603
Mar 02, 2017
- [ADQL] Fix Error when a query ends by a comment with no ending new line. · 9bc530cd
  gmantele authored Mar 02, 2017
  
  9bc530cd
Sep 20, 2016

[ADQL] Fix the tree generated by the parsing of NATURAL JOINs. · 7ca49f81

gmantele authored Sep 20, 2016

The "normal" JOIN:
    A JOIN B ON A.id = B.id JOIN C ON B.id = C.id
is correctly interpreted as:
    ( (A JOIN B ON A.id = B.id) JOIN C ON B.id = C.id )
But with a NATURAL JOIN, the tree is mirrored:
    A NATURAL JOIN B NATURAL JOIN C
gives:
    ( A NATURAL JOIN (B NATURAL JOIN C) )
instead of:
    ( (A NATURAL JOIN B) NATURAL JOIN C )
This is not a problem when the SQL translation is identical to the ADQL
expression, but for some DBMS a conversion into an INNER JOIN ON is necessary
and in this case we got the following SQL:
    A JOIN B JOIN C ON A.id = B.id ON B.id = C.id
Which seems to work, but is syntactically strange.

This commit should fix the generated tree. A "normal" JOIN and a NATURAL JOIN
should now have the same form. A JUnit test has been added into TestADQLParser
to check that: testJoinTree().
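The difference between the two tree shapes can be sketched with plain strings. The class `JoinTreeSketch` is hypothetical and does not use the grammar or the tree classes of ADQL-Lib; it only shows a left fold (the expected form) next to a right fold (the mirrored form).

```java
import java.util.List;

/** Hypothetical sketch; not the grammar or tree classes of ADQL-Lib. */
final class JoinTreeSketch {
    /** Folds from the left: each new table joins the tree built so far. */
    static String leftAssociative(List<String> tables) {
        String tree = tables.get(0);
        for (int i = 1; i < tables.size(); i++)
            tree = "(" + tree + " NATURAL JOIN " + tables.get(i) + ")";
        return tree;
    }

    /** Folds from the right: reproduces the mirrored tree described above. */
    static String rightAssociative(List<String> tables) {
        int last = tables.size() - 1;
        String tree = tables.get(last);
        for (int i = last - 1; i >= 0; i--)
            tree = "(" + tables.get(i) + " NATURAL JOIN " + tree + ")";
        return tree;
    }
}
```

For the tables A, B and C, the left fold gives `((A NATURAL JOIN B) NATURAL JOIN C)` and the right fold gives `(A NATURAL JOIN (B NATURAL JOIN C))`.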

7ca49f81

Jul 13, 2016

[ALL] Restore some sleeping JUnit tests + Allow a reset of custom types · 2c79eb6f

gmantele authored Jul 13, 2016

in DBType.DBDatatype (for UNKNOWN and UNKNOWN_NUMERIC). This reset is performed
after each JUnit test setting a special custom value (otherwise it prevents other
JUnit tests from running correctly).

2c79eb6f

Apr 20, 2016
- [ADQL] Adapt the JUnit test case for ADQLParser according to the last commit. · 9a0f1022
  gmantele authored Apr 20, 2016
  
  9a0f1022
Mar 04, 2016

[ADQL] Set a type to a query's resulting column when it is not originally a column. · 0003e343

gmantele authored Mar 04, 2016

This is easily possible for concatenations, string constants and User Defined
Functions having a FunctionDef. A new special datatype was needed for
numeric functions and operations: UNKNOWN_NUMERIC. This special type
cannot be set with FunctionDef.parse(...) and it behaves exactly like the type
UNKNOWN, except that DBType.isNumeric() returns true (as does isUnknown()).
Thus, while writing the metadata of a result in TAP, nothing changes:
an UNKNOWN_NUMERIC type is processed like an UNKNOWN type: the type returned
by the database ResultSet is used, or VARCHAR is set.
(no modification of TAP was needed for that)

0003e343

Sep 01, 2015

[ADQL,TAP] Fix bug (reported by G. Landais) in the understanding of UNKNOWN · 271e03cc

gmantele authored Sep 01, 2015

types. The notion of "unknown type" differs depending on the target
object:
  - a DBType and a FunctionDef have an unknown type if their function
    isUnknown() returns true. In such case, the other functions such as
    isNumeric/String/Geometry() will return false.
  - an ADQLOperand (e.g. ADQLColumn) does NOT have an isUnknown() function.
    But if the type of the operand is unknown, its functions isNumeric(),
    isString() and isGeometry() must ALL return true. Otherwise, just one of
    these functions can return true.
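The two conventions above can be sketched side by side. The classes `TypeLike` and `OperandLike` and the enum `Kind` are hypothetical; they only mimic the behaviour described for DBType/FunctionDef and for ADQLOperand, and are not the library's classes.

```java
/** Hypothetical sketch; these classes only mimic the convention described above. */
final class UnknownTypeSketch {
    enum Kind { NUMERIC, STRING, GEOMETRY, UNKNOWN }

    /** Mimics a DBType or FunctionDef: an unknown type belongs to no category. */
    static final class TypeLike {
        private final Kind kind;
        TypeLike(Kind kind) { this.kind = kind; }
        boolean isUnknown()  { return kind == Kind.UNKNOWN; }
        boolean isNumeric()  { return kind == Kind.NUMERIC; }
        boolean isString()   { return kind == Kind.STRING; }
        boolean isGeometry() { return kind == Kind.GEOMETRY; }
    }

    /** Mimics an ADQLOperand: no isUnknown(); an unknown type answers true to every test. */
    static final class OperandLike {
        private final Kind kind;
        OperandLike(Kind kind) { this.kind = kind; }
        boolean isNumeric()  { return kind == Kind.UNKNOWN || kind == Kind.NUMERIC; }
        boolean isString()   { return kind == Kind.UNKNOWN || kind == Kind.STRING; }
        boolean isGeometry() { return kind == Kind.UNKNOWN || kind == Kind.GEOMETRY; }
    }
}
```

In this sketch, an operand of unknown type passes any type check performed by a caller, whereas a type definition marked unknown has to be detected explicitly through `isUnknown()`.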

271e03cc

Aug 27, 2015

[ADQL] Fix a Big Bug reported by M.Taylor and M.Demleitner: in ORDER BY, GROUP... · 13a2dc54

gmantele authored Aug 27, 2015

[ADQL] Fix a Big Bug reported by M.Taylor and M.Demleitner: in ORDER BY, GROUP BY and USING only regular and delimited identifiers are accepted, not qualified column names.
For instance: "SELECT table.column_name FROM table ORDER BY table.column_name" is wrong. We should instead write:
"SELECT table.column_name FROM table ORDER BY column_name".
"SELECT table.column_name AS mycol FROM table ORDER BY mycol" is also correct.
Of course, for ORDER BY and GROUP BY, it is still possible to reference a column using its index in the SELECT clause.
For instance: "SELECT table.column_name FROM table ORDER BY 1".

13a2dc54

Jul 20, 2015
- [ADQL,TAP] Add the ability to declare columns and UDF with an UNKNOWN type. · 37e26a11
  gmantele authored Jul 20, 2015
```
(merge with branch 'unknownFctType')
```
  37e26a11
Jun 16, 2015
- [ADQL] Fix the ADQL DEBUG mode ; now ADQLParser.setParser(boolean) is doing... · dc766f2c
  gmantele authored Jun 16, 2015
```
[ADQL] Fix the ADQL DEBUG mode ; now ADQLParser.setParser(boolean) is doing what it is supposed to do.
```
  dc766f2c
Jun 08, 2015
- Merge branch 'master' into objectPosition · 843c9960
  gmantele authored Jun 08, 2015
  
  843c9960
Oct 28, 2014

[ADQL,TAP] Add STC-S and UDFs support in the ADQL parser. Now, it is possible... · 496e769c

gmantele authored Oct 28, 2014

[ADQL,TAP] Add STC-S and UDFs support in the ADQL parser. Now, it is possible to provide a list of allowed UDFs, regions and coordinate systems. The ServiceConnection of TAP is now able to provide these lists and to propagate them to the ADQLExecutor. UDFs and allowed regions are now listed automatically in the /capabilities resource of TAP. The type 'geometry' is now fully supported in ADQL. That's why the new function 'isGeometry()' has been added to all ADQLOperand extensions. Now the DBChecker is also able to roughly check the types of columns and UDFs (unknown when parsing syntactically a query). The syntax of STC-S regions (expressed in the REGION function) is now checked by DBChecker. However, for the moment, geometries are not serialized in STC-S in the output, but it should be possible in some way in the next commit(s).

496e769c