CREATE TABLE t1(x TEXT);
INSERT INTO t1(x) VALUES('This field
contains a new-line character');
The text field contains a newline character (or maybe a newline and a
carriage-return if you typed the above on a Windows machine). If you
then using the ".dump" command of the CLI to reconstruct the database,
like this:
sqlite3 t1.db .dump | sqlite3 t1-copy.db
The newline and/or carriage-return characters might not make it
through the translation process, depending on your system and how it
is set up to handle end-of-line character translations. The SQLite
CLI could be improved to be cleaner about this, but it depends on
external utilities such as readline() or linenoise() whose handling of
end-of-line characters might not be consistent across all platforms,
so there are limits to what can be done there.
One possible work-around is to encode end-of-line characters specially
to ensure they survive a ".dump" and restore. This is done in a
branch. (https://www.sqlite.org/src/timeline?r=string-quoting-dump).
The output of the above database looks like this:
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE t1(x);
INSERT INTO t1 VALUES('This field'||char(10)||'contains a new-line character');
COMMIT;
The newline character gets translated into a char(10) function.
Question: Is this a desirable feature? Should it be merged to trunk?
Advantages: The .dump output is become immune to the irregularities
in end-of-line handling across systems.
Disadvantages: (1) The .dump output is no longer portable to EBCDIC
systems. (2) The output is more complex. (3) Not all SQL database
engines support the char() function or the || concatenation operator,
which makes the .dump output less portable. (4) The dump becomes
harder to read.
Mitigating Factors: This only comes up for databases that have text
fields with embedded end-of-line characters. My sense is that such
databases are uncommon.
Your feedback is appreciated.
--
D. Richard Hipp
d...@sqlite.org
_______________________________________________
sqlite-dev mailing list
sqlit...@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-dev
I think it is reasonable to merge. Even if it is not 100% portable to other SQL engines, I think the greater utility is in being able to round trip a dump file back into SQLite reliably. Also, converting a control code into a "visible" character sequence could make further transformations on the dump file easier, even for EBCDIC systems.Could it be done in the same way with X'##' sequences?
> The SQLite CLI could be improved to be cleaner about this, but it
> depends on external utilities such as readline() or linenoise()
Could you expand on that a bit, please? While I'm sure you've tested
it, the assertion indicts every application that uses readline. Doesn't
readline work at the terminal interface, and pass input transparently
if standard input is a file?
If you decide you don't want to pass newlines verbatim in a dump, I
suggest using vis(3) from BSD. Ubuntu packages it as libbsd-dev. The
output is reversible. There is a simple utility to restore the
newlines for systems that require them, or for visual inspection. (I
haven't been able to find it packaged, but it's easy enough to build.
Cf. http://cvsweb.netbsd.org/bsdweb.cgi/src/usr.bin/vis/.)
Using libvis:
sqlite3 t1.db .dump | sqlite3 t1-copy.db
Just Works, as on your branch.
sqlite3 t1.db .dump | unvis | less
shows the output with newlines, as it appears today.
sqlite3 t1.db .dump | unvis | some-db-utility
can read the output regardless of its support for char() or ||.
--jkl
Readline() strips \n and \r from the end of each input line, does it
not? How is SQLite suppose to know if it was just a \n that was
stripped or both a \n and a \r?
--
D. Richard Hipp
d...@sqlite.org
That approach only works for UTF8. If you have a UTF16 database, it
breaks down. The use of char(), on the other hand, works for both.
On 3/11/17, Scott Robison <sc...@casaderobison.com> wrote:That approach only works for UTF8. If you have a UTF16 database, it
>
> Yes, the same could be done with X'##' sequences. At least this:
>
> select typeof('asdf'||X'0A'||'asdf')
>
> yields 'text' as the result.
breaks down. The use of char(), on the other hand, works for both.
> >> The SQLite CLI could be improved to be cleaner about this, but it
> >> depends on external utilities such as readline() or linenoise()
> >
> > Could you expand on that a bit, please?
>
> Readline() strips \n and \r from the end of each input line, does it
> not? How is SQLite suppose to know if it was just a \n that was
> stripped or both a \n and a \r?
A quick test suggests that's not exactly what's happening. If readline
really is interfering, though,I think a better answer is go around
it when there's no tty.
My sqlite3 is not linked to readline, and strips out \r. Here are bash
and sqlite3:
$ printf '1\r\n2\r\n3\r\n' | bash -c 'while read f; do echo $f; done' |
cat -v
1^M
2^M
3^M
Bash uses readline, and reproduces \r faithfully.
$ printf "select '1\r\n2\r\n3\r\n';" | sqlite3 | cat -v
1
2
3
Where'd they go? (I'm using Ubuntu.)
Just to verify, no readline in my sqlite3:
$ ldd $(which sqlite3) | sed -E 's/>.+$/>/'
linux-vdso.so.1 =>
libdl.so.2 =>
libpthread.so.0 =>
libc.so.6 =>
/lib64/ld-linux-x86-64.so.2 (0x00007fd59113a000)
It looks to me that carriage returns are being lost without any help
from readline.
But, if readline is interfering on some platforms, sqlite3 isn't
obliged to use it to read from standard input when there's no keyboard
attached to it. Isn't it simpler to test with isatty(3), and use fgets
if it returns 0?
--jkl