[Lazarus] Tests results of several pascal based JSON parsers

Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list
On Fri, Aug 30, 2019 at 9:09 PM Bart <[hidden email]> wrote:

> On Windows it prints FALSE, both with 3.0.4 and trunk r42348

It fails on both comparisons (hexadecimal representation of the
returned unicodestrings):

Name    : 004A 006F 0065 003F 0053 0063 0068 006D 006F 0065
Expected: 004A 006F 0065 00AE 0053 0063 0068 006D 006F 0065
00AE is replaced by 003F (a question mark, IIRC)

Occupation: 0062 0061 006E 006B 0020 0074 0065 006C 006C 0065 0072
0020 003F 0020
Expected: 0062 0061 006E 006B 0020 0074 0065 006C 006C 0065 0072 0020 00AE 0020

Same replacement there.

fpc trunk r42348.
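
For anyone wanting to produce the same kind of dump, here is a minimal sketch of a helper that prints the UTF-16 code units of a UnicodeString as hex. The name HexDump is made up for illustration and is not taken from the test program discussed in this thread.

```pascal
program hexdump;
{$mode objfpc}
{$h+}
uses
  SysUtils;

// HexDump is a hypothetical helper name, not part of FPJson or JsonTools.
// It returns the UTF-16 code units of S as space-separated 4-digit hex values.
function HexDump(const S: UnicodeString): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
  begin
    if I > 1 then
      Result := Result + ' ';
    Result := Result + IntToHex(Integer(Ord(S[I])), 4);
  end;
end;

begin
  // Plain ASCII input, so the result does not depend on any codepage setup.
  WriteLn(HexDump(UnicodeString('Joe')));
end.
```

Dumping the code units this way shows directly whether 00AE survived or was replaced by 003F, independent of how the console renders the character.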

--
Bart
--
_______________________________________________
lazarus mailing list
[hidden email]
https://lists.lazarus-ide.org/listinfo/lazarus

Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list
For those tracking the unicode issue, could you please verify that the problem does not occur with my JsonTools library on your compiler revisions and platforms? I always get true (passed) with my library, but not with any other library. Here is the relevant test:

function VerifyUnicodeChars: Boolean;
const
  UnicodeChars = '{ "name": "Joe®Schmoe", "occupation": "bank teller \u00Ae " }';
var
  N: TJsonNode;
begin
  N := TJsonNode.Create;
  N.Parse(UnicodeChars);
  Result := (N.Child(0).AsString = 'Joe®Schmoe') and (N.Child(1).AsString = 'bank teller ® ');
  N.Free;
end;

begin
  WriteLn('Handles unicode characters correctly: ', VerifyUnicodeChars);
end.


Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list


On Fri, 30 Aug 2019, Anthony Walter via lazarus wrote:

> I am not sure how under any situation parsing a JSON from a stream source
> would be any faster than parsing a string.

If you would check the fpjson code, you'd see why.
You'd also see why there is plenty of room for improvement.

> Also with regards to timing I am
> not sure how accurate Now is. For this purpose I've written:

I agree that for time measurements of 10ms or less, you should not use Now
for measurements.

But over a total timespan of 2.9 seconds, now is plenty accurate.
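
As a rough illustration of the alternative for short intervals, here is a sketch that measures elapsed time with GetTickCount64 from SysUtils instead of Now. The workload is an arbitrary placeholder and is not the benchmark discussed in this thread.

```pascal
program timing;
{$mode objfpc}
{$h+}
uses
  SysUtils;

var
  T0: QWord;
  I: Integer;
  X: Double;
begin
  T0 := GetTickCount64;          // millisecond tick counter
  X := 0;
  for I := 1 to 10000000 do      // placeholder workload, assumption for the demo
    X := X + Sqrt(I);
  WriteLn('Elapsed ms: ', GetTickCount64 - T0, ' (sum = ', X:0:2, ')');
end.
```

For runs lasting seconds, either approach should give comparable numbers, which is the point made above.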


Michael.

Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list


On Fri, 30 Aug 2019, Bart via lazarus wrote:

> On Fri, Aug 30, 2019 at 9:09 PM Bart <[hidden email]> wrote:
>
>> On Windows it prints FALSE, both with 3.0.4 and trunk r42348
>
> It fails on both comparisons (hexadecimal representation of the
> returned unicodestrings):
>
> Name    : 004A 006F 0065 003F 0053 0063 0068 006D 006F 0065
> Expected: 004A 006F 0065 00AE 0053 0063 0068 006D 006F 0065
> 00AE is replaced by 003F (a questionmark IIRC)
>
> Occupation: 0062 0061 006E 006B 0020 0074 0065 006C 006C 0065 0072
> 0020 003F 0020
> Expected: 0062 0061 006E 006B 0020 0074 0065 006C 006C 0065 0072 0020 00AE 0020
>
> Same replacement there.

Can you try setting defaultsystemcodepage to UTF8 ?

Michael.

Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list


On Fri, 30 Aug 2019, Anthony Walter via lazarus wrote:

> Okay, so I turned on my Windows VM with a different version of FPC and ran
> VerifyUnicodeChars with both FPJson and JsonTools. The results are the
> same. JsonTools sees the unicode correctly, and something is wrong when
> using FPJson. I don't know what the problem is, but other people are
> noticing similar issues, so it would seem there is definitely a problem
> resulting in a failure for FPJson.
>
> Michael, you have all the information needed to find out what's wrong and
> I'd be curious to learn why it's not working.

Allow me to correct you, I don't have all the information:

1. Did you run my provided test program on linux ?
    (the first one I sent, not the one for the speed test)

    If my test program also shows different results on your machine,
    indeed something strange is going on.

    The reason I insist on the use of my test program and not yours,
    is that the result can be influenced by some unknown units or whatnot in yours,
    and my program is "bare bones".

    On the assumption my private computer can be somehow 'compromised' by years
    of FPC development I even copied my program to 2 other linux machines,
    used for production work, and the result is 'True' on both.
    (see log below for one of them, this is with standard ubuntu installed compiler)

    I can believe the program would output something different on Windows,
    but not on another linux box.

2. Where is the source code of your test program(s) ?
    At least the ones for fpjson and jsontools.

    Without your actual source code, I cannot give advice or investigate properly.

I'd like to see this cleared up. fpjson has been in use in many REST
services in large production sites for at least 8 years, these services
definitely use UTF8 content outside the basic ASCII or even ANSI codepages.

So the failures you see are highly surprising to me, to say the least.

Michael.

Copy&paste from a quick compile session.
----
Welcome to Ubuntu 18.04.1 LTS (GNU/Linux 4.15.0-34-generic x86_64)

  * Documentation:  https://help.ubuntu.com
  * Management:     https://landscape.canonical.com
  * Support:        https://ubuntu.com/advantage

  * Keen to learn Istio? It's included in the single-package MicroK8s.

      https://snapcraft.io/microk8s

  * Canonical Livepatch is available for installation.
    - Reduce system reboots and improve kernel security. Activate at:
      https://ubuntu.com/livepatch
   _____
  / ___/___  _  _ _____ _   ___  ___
| |   / _ \| \| |_   _/ \ | _ )/ _ \
| |__| (_) | .` | | |/ _ \| _ \ (_) |
  \____\___/|_|\_| |_/_/ \_|___/\___/

Welcome!

This server is hosted by Contabo. If you have any questions or need help,
please don't hesitate to contact us at [hidden email].

Last login: Tue Aug 20 14:23:39 2019 from 81.82.199.218
root@vmi203569:~# fpc twalter.pas -S2
Free Pascal Compiler version 3.0.4+dfsg-18ubuntu1 [2018/07/02] for x86_64
Copyright (c) 1993-2017 by Florian Klaempfl and others
Target OS: Linux for x86-64
Compiling twalter.pas
Linking twalter
/usr/bin/ld.bfd: warning: link.res contains output sections; did you forget -T?
26 lines compiled, 0.4 sec
root@vmi203569:~# ./twalter
Handles unicode chars correctly: >{ "name": "Joe®Schmoe", "occupation": "bank teller \u00Ae " }<
TRUE

Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list
Michael, regarding this unicode problem, all the code has already been posted in this thread.

program Test;

uses
  FPJson, JsonParser, JsonTools;

const
  UnicodeChars = '{ "name": "Joe®Schmoe", "occupation": "bank teller \u00Ae " }';

function VerifyUnicodeCharsFPJson: Boolean;
var
  N: TJSONData;
begin
  N := GetJSON(UnicodeChars);
  Result := (N.Items[0].AsUnicodeString = 'Joe®Schmoe') and (N.Items[1].AsUnicodeString = 'bank teller ® ');
  N.Free;
end;

function VerifyUnicodeCharsJsonTools: Boolean;
const
  UnicodeChars = '{ "name": "Joe®Schmoe", "occupation": "bank teller \u00Ae " }';
var
  N: TJsonNode;
begin
  N := TJsonNode.Create;
  N.Parse(UnicodeChars);
  Result := (N.Child(0).AsString = 'Joe®Schmoe') and (N.Child(1).AsString = 'bank teller ® ');
  N.Free;
end;

begin
  WriteLn('FPJson Handles unicode chars correctly: ', VerifyUnicodeCharsFPJson);
  WriteLn('JsonTools Handles unicode chars correctly: ', VerifyUnicodeCharsJsonTools);
end.                              

Output:

FPJson Handles unicode chars correctly: FALSE
JsonTools Handles unicode chars correctly: TRUE

Tested on both Linux and Windows with the same results. Other people, using differing versions of FPC on differing platforms, have verified the same result. Try the tests yourself. Maybe you can figure out what's going wrong.


Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list
If there is any chance the char codes are being altered through whatever browser / mail client you are using, here is a direct link to the program source:



Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list
Michael, I hadn't tried your example code yet as I thought the discussion was on the topic of the unicode failure, and your example was about parsing speed. I'll be happy to take a look at speed improvements, but like you I am interested to find out what's failing with VerifyUnicodeCharsFPJson.


Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list


On Sat, 31 Aug 2019, Anthony Walter via lazarus wrote:

> Michael, regarding this unicode problem, all the code has already been
> posted in this thread.


>
> program Test;
>
> uses
>  FPJson, JsonParser, JsonTools;

There you are. You're missing the cwstring unit and the codepage directive.

Change the above code to

{$mode objfpc}
{$h+}
{$codepage utf8}
uses
   {$IFDEF UNIX}cwstring, {$ENDIF} FPJson, JsonParser, JsonTools;

and it will work correctly.
(The objfpc mode and $h+ are probably in your fpc.cfg or lazarus setup)

Your program will only work correctly in a utf8-only environment.
(see also below)

But fpJSON relies on the fpc infrastructure to handle all codepages, as a
consequence this infrastructure also must be set up properly.

Now you can see why I insisted on using my program, it was known to work
correctly: it sets up things properly. If you look at my initial mail,
you'll also see that I explicitly mentioned including cwstring.
You probably failed to pick up on that important piece of info.

So, mystery solved.
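
For reference, here is a sketch of the FPJson half of the test with the directives and the cwstring unit from above applied. It only combines code already posted in this thread. The program name is arbitrary, and the source file is assumed to be saved as UTF-8 so that it matches the codepage directive.

```pascal
program testfpjson;
{$mode objfpc}
{$h+}
{$codepage utf8}
uses
  {$IFDEF UNIX}cwstring, {$ENDIF} fpjson, jsonparser;

const
  UnicodeChars = '{ "name": "Joe®Schmoe", "occupation": "bank teller \u00Ae " }';

// Same comparison as in the posted test, with try/finally added for cleanup.
function VerifyUnicodeCharsFPJson: Boolean;
var
  N: TJSONData;
begin
  N := GetJSON(UnicodeChars);
  try
    Result := (N.Items[0].AsUnicodeString = 'Joe®Schmoe')
      and (N.Items[1].AsUnicodeString = 'bank teller ® ');
  finally
    N.Free;
  end;
end;

begin
  WriteLn('FPJson handles unicode chars correctly: ', VerifyUnicodeCharsFPJson);
end.
```

The IFDEF keeps cwstring out of the uses clause on non-Unix targets, where it is not available.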

That said :

Unfortunately JSONTools is also not without problems.

I copied the program to a windows VM.

Attached screenshot of the output.

As you can see, jsontools also does not work correctly.

It's no mystery why not. I had to add

   DefaultSystemCodePage:=CP_UTF8;

as the first line in the program, then it does show TRUE for both tests.
Now, if you work in lazarus, it does this for you, so you don't notice/know it.
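
A minimal sketch of that setup in a plain console program follows. The program name is arbitrary, and SysUtils is listed only so that the uses clause stays valid when cwstring is compiled out on non-Unix targets.

```pascal
program cpsetup;
{$mode objfpc}
{$h+}
{$codepage utf8}
uses
  {$IFDEF UNIX}cwstring, {$ENDIF} SysUtils;

begin
  // First statement of the program, before any string conversion happens.
  DefaultSystemCodePage := CP_UTF8;
  WriteLn('DefaultSystemCodePage: ', DefaultSystemCodePage);
  // ... JSON parsing and string comparisons would follow here ...
end.
```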

Codepages & strings require careful setup. Contrary to popular belief, it does not 'just work'.

All this is documented:

https://www.freepascal.org/docs-html/current/ref/refsu9.html#x32-390003.2.4

Many people tend to ignore this, because Lazarus does a lot behind the scenes
(which is a good thing).

But if people use your JSONTools in a 'mixed' environment, you might
get strange results, if you ignore the correct and careful setup.

You control your environment, and jsontools will function correctly in your
environment. But it's a big world out there, where things might be happening
that you didn't foresee but which do influence jsontools.

I hope with my explanations, you are now well equipped/informed to strengthen
jsontools and help people should problems nevertheless pop up.

Now that we've hopefully established that fpjson does work correctly,
I would appreciate it if you could correct the JSON test comparison page you created.

Michael.

jsontools.png (11K) Download Attachment

Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list
Okay, going back and looking through the messages I see you did post a test with:

{$codepage UTF8} and uses cwstring

Here are the results with that added:

On Linux using {$codepage UTF8} by itself causes both tests to fail. Adding cwstring causes both tests to work. On Windows trying to use cwstring causes the compilation to fail, but with {$codepage UTF8} added the tests work. I will try a few more tests, but there should be an "out of the box" option to get FPJson working without the need to add ifdefs along with extra directives added outside of the FPJson units themselves.

I will write a few more unicode tests, perhaps with 4 byte character strings, and some other potential unicode problems to be sure both are working before we come to a final resolution.


Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list


On Sat, 31 Aug 2019, Anthony Walter via lazarus wrote:

> Okay, going back and looking through the messages I see you did post a test
> with:
>
> {$codepage UTF8} and uses cwstring
>
> Here are the results with that added:
>
> On Linux using {$codepage UTF8} by itself causes both tests to fail. Adding
> cwstring causes both tests to work. On Windows trying to use cwstring
> causes the compilation to fail, but with {$codepage UTF8} added the tests
> work. I will try a few more tests, but there should be an "out of the box"
> option to get FPJson working without the need to add ifdefs along with
> extra directives added outside of the FPJson units themselves.

Glad you picked it up.

See my other mail for more details.

Bottom line:
You simply cannot ignore this. Doing so is asking for problems.

It may work for you, but fail for someone else, and then you'll be
scratching your head as to "why on earth doesn't it work?"

Michael.

Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list


On Sat, 31 Aug 2019, Michael Van Canneyt via lazarus wrote:

>
>
> On Sat, 31 Aug 2019, Anthony Walter via lazarus wrote:
>
>> Okay, going back and looking through the messages I see you did post a test
>> with:
>>
>> {$codepage UTF8} and uses cwstring
>>
>> Here are the results with that added:
>>
>> On Linux using {$codepage UTF8} by itself causes both tests to fail. Adding
>> cwstring causes both tests to work. On Windows trying to use cwstring
>> causes the compilation to fail, but with {$codepage UTF8} added the tests
>> work. I will try a few more tests, but there should be an "out of the box"
>> option to get FPJson working without the need to add ifdefs along with
>> extra directives added outside of the FPJson units themselves.
>
> Glad you picked it up.
>
> See my other mail for more details.
>
> Bottom line:
> You simply cannot ignore this. Doing so is asking for problems.
>
> It may work for you, but fail for someone else, and then you'll be
> scratching your head as to "why on earth doesn't it work?"

One last thing.

Lazarus includes cwstring by default:

interfaces/carbon/interfaces.pas:  {$IFNDEF DisableCWString}cwstring,{$ENDIF}
interfaces/cocoa/interfaces.pas:  {$IFNDEF DisableCWString}cwstring,{$ENDIF}
interfaces/gtk2/interfaces.pas:{$IFDEF UNIX}{$IFNDEF DisableCWString}uses cwstring;{$ENDIF}{$ENDIF}
interfaces/gtk3/interfaces.pp:  {$IFDEF UNIX}{$IFNDEF DisableCWString}cwstring,{$ENDIF}{$ENDIF}
interfaces/qt5/interfaces.pp:  {$IFDEF UNIX}{$IFNDEF DisableCWString}cwstring,{$ENDIF}{$ENDIF}
interfaces/qt/interfaces.pp:  {$IFDEF UNIX}{$IFNDEF DisableCWString}cwstring,{$ENDIF}{$ENDIF}

If you look in the code, you'll see that it handles codepages explicitly in
many places.

Just to corroborate that ignoring this is not an option, and that lazarus
goes to great lengths to make it easier on people.

Michael.

Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list
On 31.08.2019 at 09:45, Michael Van Canneyt via lazarus wrote:
> Codepages & strings require careful setup. Contrary to popular belief, it does not 'just work'.
>
> All this is documented:
>
> https://www.freepascal.org/docs-html/current/ref/refsu9.html#x32-390003.2.4
>
> Many people tend to ignore this, because Lazarus does a lot behind the scenes (which is a good thing).

Looking at the text of the "Code page conversions" section: what do these mean: (CODE_CP ¡¿ CP_ACP) ? Or should it have been (CODE_CP <> CP_ACP)?

Regards,
Sven


Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list
On 31.08.2019 at 11:08, Sven Barth wrote:
> On 31.08.2019 at 09:45, Michael Van Canneyt via lazarus wrote:
>> Codepages & strings require careful setup. Contrary to popular belief, it does not 'just work'.
>>
>> All this is documented:
>>
>> https://www.freepascal.org/docs-html/current/ref/refsu9.html#x32-390003.2.4
>>
>> Many people tend to ignore this, because Lazarus does a lot behind the scenes (which is a good thing).
> Looking at the text of the "Code page conversions" section: what do these mean: (CODE_CP ¡¿ CP_ACP) ? Or should it have been (CODE_CP <> CP_ACP)?

And there's another one in the section "UTF8String" at the bottom: (ordinal value ¡128) Should this have been (ordinal value < 128)?

Regards,
Sven


Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list


On Sat, 31 Aug 2019, Sven Barth via lazarus wrote:

> On 31.08.2019 at 09:45, Michael Van Canneyt via lazarus wrote:
>> Codepages & strings require careful setup. Contrary to popular belief, it
>> does not 'just work'.
>>
>> All this is documented:
>>
>> https://www.freepascal.org/docs-html/current/ref/refsu9.html#x32-390003.2.4 
>>
>> Many people tend to ignore this, because Lazarus does a lot behind the
>> scenes (which is a good thing).
> Looking at the text of the "Code page conversions" section: what do these
> mean: (CODE_CP ¡¿ CP_ACP) ? Or should it have been (CODE_CP <> CP_ACP)?
Latex->html conversion errors, I suppose. Using $\lt$ or so should fix it.

Michael.

Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list
Could you include https://github.com/BeRo1985/pasjson in the comparison?

On Fri, Aug 30, 2019 at 4:22 PM Anthony Walter via lazarus <[hidden email]> wrote:
Alan, oh that's a good idea. I will do that as well as add a few more parser libraries as requested by a few people in other, non-mailing-list threads. I will also try to find out what's going on with the unicode strings as it might be a problem with the compiler.

Michael,

I am on Linux as well, but I will test under Windows and Mac too.

Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list
On Fri, Aug 30, 2019 at 11:02 PM Michael Van Canneyt via lazarus
<[hidden email]> wrote:

> Can you try setting defaultsystemcodepage to UTF8 ?

Feeling a little bit embarrassed now (I'm used to Lazarus which
defaults to that).
With DefaultSystemCodePage := CP_UTF8 it works:

Handles unicode chars correctly: >{ "name": "Joe®Schmoe",
"occupation": "bank teller \u00Ae " }<
Name    : 004A 006F 0065 00AE 0053 0063 0068 006D 006F 0065 [Joe®Schmoe]
Expected: 004A 006F 0065 00AE 0053 0063 0068 006D 006F 0065 [Joe®Schmoe]
Occupation: 0062 0061 006E 006B 0020 0074 0065 006C 006C 0065 0072
0020 00AE 0020 [bank teller ® ]
Expected: 0062 0061 006E 006B 0020 0074 0065 006C 006C 0065 0072 0020
00AE 0020 [bank teller ® ]
TRUE

--
Bart

Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list
> Could you include https://github.com/BeRo1985/pasjson in the comparison?

Sure. I also have a few others that people have requested. I will also list the license of each in the first table.

Regarding huge, multi-gigabyte JSON files, I know a small portion of programmers might be inclined to use JSON as an offline database format, much like CSV. Even though by far most JSON is used with XMLHttpRequest, REST APIs, or for storing settings and configurations, there are bound to be endless requests for other use cases.

For example, to accommodate reading a huge file as individual records, a helper class operating outside the definition of a JSON parser could accomplish this goal. It would be relatively easy to write in a separate file:

type
  TJsonStreamReader = class
  public
    constructor Create(Stream: TStream; OwnsStream: Boolean = False);
    constructor CreateFromFile(const FileName: string);
    destructor Destroy; override;
    function Read(out Parser: TSomeJsonParser): Boolean;
  end;

Then use as ...

var
  R: TJsonStreamReader;
  P: TSomeJsonParser;
begin
  R := TJsonStreamReader.Create(InputStreamOrFileName);
  try
    while R.Read(P) do
    begin
      // Process the JSON record in P here
    end;
  finally
    R.Free;
  end;
end;

And in this way a large file could be read in small blocks and given back to the user as a parser to allow for processing of individual records. The benefit of breaking this into its own class is that you do not need to mix in every possible use case into the parser. You can simply write separate use cases into their own independent units, rather than trying to make a super class which handles every possible concern.
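
To make the idea concrete, here is a much simpler sketch than the class outlined above. It assumes the file holds one complete JSON document per line (an assumption of this sketch, not something stated above), reads it line by line, and hands each line to fpjson's GetJSON. The file name and sample records are made up for the demo.

```pascal
program ndjsonread;
{$mode objfpc}
{$h+}
uses
  SysUtils, fpjson, jsonparser;

const
  FileName = 'records.ndjson';   // hypothetical sample file

var
  F: TextFile;
  Line: string;
  Rec: TJSONData;
begin
  // Write a tiny sample file so the sketch is self-contained.
  AssignFile(F, FileName);
  Rewrite(F);
  WriteLn(F, '{"id": 1, "name": "a"}');
  WriteLn(F, '{"id": 2, "name": "b"}');
  CloseFile(F);

  // Read it back one record at a time; only one record is in memory at once.
  AssignFile(F, FileName);
  Reset(F);
  try
    while not Eof(F) do
    begin
      ReadLn(F, Line);
      if Trim(Line) = '' then
        Continue;
      Rec := GetJSON(Line);
      try
        WriteLn('id = ', Rec.FindPath('id').AsInteger);
      finally
        Rec.Free;
      end;
    end;
  finally
    CloseFile(F);
  end;
end.
```

The record-splitting logic lives entirely outside the parser, which is the separation argued for above. A real reader would need a proper record boundary detector for files that are not line-delimited.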

For example, if you wanted to store object state using RTTI in a JSON file, create a separate TJsonObjectState class to handle this for you. Or if you wanted to create a database table from a JSON file, or create a JSON file from a database table, then again write this into its own class.

The point is, saying this JSON class does lots of these things is the wrong approach (IMO), as these use case scenarios are likely endless and would add unnecessary cruft to a parser. Even designing a plug-in or other extensibility mechanism seems unnecessary, when simple separate classes that add functionality work as well without all the complexity.





Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list


On Sat, 31 Aug 2019, Anthony Walter wrote:

>> Could you include https://github.com/BeRo1985/pasjson in the comparison?
>
> Sure. I also have a few other people have requested. I will also list the
> license of each in the first table.
>

[snip]

> For example if wanted to store object state using RTTI in a JSON file,
> create a separate TJsonObjectState class to handle this for you. Or if you
> wanted to create a database table from a JSON file, or create a JSON file
> from a database table, then again write this into its own class.

Not sure I understand what you mean.

It seems to me that in that case you will repeat your scanner/parser code all over the place.
In case of an error, you need to fix it in as many places.

I can of course be wrong.

The current fpjson scanner/parser can be used anywhere.
You don't need to use fpjson data structures to be able to use the scanner or reader:
It's perfectly possible to use the scanner/parser to create TJSONNode from JSONTools.

But general usability comes indeed at the price of some speed loss.

That said, your classes are markedly slower when it comes to data manipulation.

The following is 100 (!) times slower than fpjson code:

{$mode objfpc}
{$h+}
uses dateutils,sysutils, jsontools;

Var
   I,N : Integer;
   D,E : TJSONNode;
   B : double;
   NT : TDateTime;

begin
   N:=10000000;
   D:=TJSONNode.Create;
   D.Parse('{ "d": 12345678.3 }');
   E:=D.Child(0);
   NT:=Now;
   B:=1;
   for i:=0 to N  do
     B:=E.AsNumber * 2;
   Writeln('Time ',MillisecondsBetween(Now,NT));
   D.Free;
end.

home:~> ./tb
Time 3888

Same program in fpJSON:

home:~> ./tb2
Time 32

This is because when accessing the value, you must do the conversion to
float.  Every time.

This is true for JSON string values as well: you must re-decode the JSON on
every access. And you do it over and over again, each time the data is accessed.

No doubt you can easily fix this by storing the value in the proper type, but this
will slow down your parser.
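
One possible middle ground, sketched here purely as an illustration, is to keep the raw text during parsing and cache the converted value on first access. TCachedNumber is a hypothetical class and is not how JsonTools or fpjson are actually implemented.

```pascal
program cachedemo;
{$mode objfpc}
{$h+}
uses
  SysUtils;

type
  // Hypothetical node type, for illustration only.
  TCachedNumber = class
  private
    FText: string;      // raw JSON text of the number
    FValue: Double;     // converted value, valid once FParsed is True
    FParsed: Boolean;
  public
    constructor Create(const AText: string);
    function AsNumber: Double;
  end;

constructor TCachedNumber.Create(const AText: string);
begin
  inherited Create;
  FText := AText;
  FParsed := False;
end;

function TCachedNumber.AsNumber: Double;
var
  FS: TFormatSettings;
begin
  if not FParsed then
  begin
    // Convert once, using '.' as decimal separator as JSON requires.
    FS := DefaultFormatSettings;
    FS.DecimalSeparator := '.';
    FValue := StrToFloat(FText, FS);
    FParsed := True;
  end;
  Result := FValue;
end;

var
  N: TCachedNumber;
  I: Integer;
  B: Double;
begin
  N := TCachedNumber.Create('12345678.3');
  try
    B := 0;
    for I := 1 to 1000000 do
      B := N.AsNumber * 2;   // only the first call pays for the conversion
    WriteLn(B:0:1);
  finally
    N.Free;
  end;
end.
```

This costs an extra field and flag per node, so whether it pays off depends on how often values are read back.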

So:
if you use the resulting JSON a lot, code will run faster in fpJSON.

It thus boils down to a choice: do you need fast processing or fast parsing ?

In the end it will probably not matter: most likely all the nodes will be traversed
in a typical use case, and the overall time for your and my approach will be similar.

This is the danger of benchmarks. They focus on 1 aspect. In real life, all
aspects are usually present.

Anyway.

While coding this small test, I noticed that this also does not work in jsontools:

uses jsontools;

var
   D : TJSONNode;

begin
   D:=TJSONNode.Create;
   D.Parse('true'); // or D.Parse('12345678.3');
   D.Free;
end.

An unhandled exception occurred at $00000000004730B0:
EJsonException: Root node must be an array or object

If you look at the browser specs, this is supposed to work as well:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/parse

Also frequently encountered is omitting "" around property names. JSON is a
subset of Javascript:

D.Parse('{ d: 12345678.3 }');

Results in:

An unhandled exception occurred at $0000000000473075:
EJsonException: Error while parsing text

Both are things which are supported in fpJSON. No doubt you can fix this
easily.
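
For comparison, a small sketch of the top-level scalar case in fpjson, using GetJSON from the fpjson unit with jsonparser in the uses clause. The program name is arbitrary.

```pascal
program toplevel;
{$mode objfpc}
{$h+}
uses
  fpjson, jsonparser;

var
  D: TJSONData;
begin
  // A bare number as the whole document.
  D := GetJSON('12345678.3');
  try
    WriteLn('Top-level number: ', D.AsFloat:0:1);
  finally
    D.Free;
  end;

  // A bare boolean as the whole document.
  D := GetJSON('true');
  try
    WriteLn('Top-level boolean: ', D.AsBoolean);
  finally
    D.Free;
  end;
end.
```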


So you see, with some extra study, the picture of which is "better", jsontools
or fpjson, is not as clear as it may seem. In the end it all boils down to some choices.

Michael.

PS.
With 2 relatively simple changes, I took 40% off the parsing time of fpJSON.
No doubt more gain can be achieved, for example I didn't do the suggestion
by Benito yet.


Re: [Lazarus] [fpc-pascal] Tests results of several pascal based JSON parsers

Free Pascal - Lazarus mailing list
On 31/8/19 at 16:22, Michael Van Canneyt via lazarus wrote:
>
> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/parse 
>
>
> Also frequently encountered is omitting "" around property names. JSON is a
> subset of Javascript:
>
> D.Parse('{ d: 12345678.3 }');


The parser at mozilla says: "Error: JSON.parse: expected property name
or '}' at line 1 column 3 of the JSON data"

Bye

--
Luca Olivetti
Wetron Automation Technology http://www.wetron.es/
Tel. +34 93 5883004 (Ext.3010)  Fax +34 93 5883007