[Lazarus] cwstring in arm-linux


[Lazarus] cwstring in arm-linux

Henry Vermaak
Hi list

In revision 31913, Felipe did this:

r31913 | sekelsenmat | 2011-08-08 12:23:08 +0100 (Mon, 08 Aug 2011) | 1 line

Fixes linking LCL-Android by disabling all links to libc in arm-linux

--- lcl/include/lcl_defines.inc (revision 31912)
+++ lcl/include/lcl_defines.inc (revision 31913)
@@ -1,2 +1,8 @@
  // Add defines here. This file should be included in all LCL units headers
-{$define UseCLDefault}
\ No newline at end of file
+{$define UseCLDefault}
+
+// For Android and other ARM-devices, otherwise the LCL will dependent on libc
+{$IFDEF ARM}{$IFDEF UNIX}
+  {$DEFINE DisableCWString}
+  {$DEFINE DisableIconv}
+{$ENDIF}{$ENDIF}

Could someone change this ifdef to ANDROID, or something other than ARM?

Henry

--
_______________________________________________
Lazarus mailing list
[hidden email]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus

Re: [Lazarus] cwstring in arm-linux

Felipe Monteiro de Carvalho
On Mon, Oct 3, 2011 at 2:18 PM, Henry Vermaak <[hidden email]> wrote:
> Could someone change this ifdef to ANDROID, or something other than ARM?

There is no such define in FPC, so either way is problematic.

I'll check if I can finish paswstring instead.

--
Felipe Monteiro de Carvalho


Re: [Lazarus] cwstring in arm-linux

Henry Vermaak
On 03/10/11 13:52, Felipe Monteiro de Carvalho wrote:
> On Mon, Oct 3, 2011 at 2:18 PM, Henry Vermaak<[hidden email]>  wrote:
>> Could someone change this ifdef to ANDROID, or something other than ARM?
>
> There is no such define in FPC, so either way is problematic.

Yes, I know, but you can define ANDROID when you build Lazarus from the
command line.  All my arm-linux systems have libc.  I run a normal Linux
distro on my ARM netbook, so the only alternative for me would be to
patch that file.
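
Henry's suggestion could look roughly like the fragment below. This is only a sketch: ANDROID is not a symbol that FPC defines by itself (as Felipe notes above), so it is assumed here to be supplied by whoever builds the LCL, for example through the compiler's -d option (-dANDROID). The committed code may differ.

```pascal
// lcl/include/lcl_defines.inc (sketch, not the committed code)
{$define UseCLDefault}

// Assumption: ANDROID is passed in by the builder (e.g. -dANDROID);
// FPC did not predefine such a symbol at the time of this thread.
{$IFDEF ANDROID}
  {$DEFINE DisableCWString}
  {$DEFINE DisableIconv}
{$ENDIF}
```

With a guard like this, ordinary arm-linux builds that do have libc would keep cwstring and iconv.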

Henry


Re: [Lazarus] cwstring in arm-linux

Felipe Monteiro de Carvalho
Ok, I changed the define in rev 32655.

But you should note that when paswstring gets finished it will phase
out cwstrings.

--
Felipe Monteiro de Carvalho


Re: [Lazarus] cwstring in arm-linux

Henry Vermaak
On 03/10/11 15:31, Felipe Monteiro de Carvalho wrote:
> Ok, I changed the define in rev 32655.
>
> But you should note that when paswstring gets finished it will phase
> out cwstrings.

That's good news, thanks!

Henry


Re: [Lazarus] cwstring in arm-linux

zeljko
In reply to this post by Felipe Monteiro de Carvalho

On Monday 03 of October 2011 16:31:20 Felipe Monteiro de Carvalho wrote:
> Ok, I changed the define in rev 32655.
>
> But you should note that when paswstring gets finished it will phase
> out cwstrings.

Only if it's 1/1 with cwstrings please :)

zeljko

Re: [Lazarus] cwstring in arm-linux

Felipe Monteiro de Carvalho
In reply to this post by Henry Vermaak
On Mon, Oct 3, 2011 at 4:35 PM, Henry Vermaak <[hidden email]> wrote:
> That's good news, thanks!

Hello, could you test the very latest Pascal Widestring Manager? Just
disable cwstring and then add paswstring as the first unit in your
project's uses clause.

The Pascal Widestring Manager is completed, but it needs more testing =)
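
Those test instructions amount to a project like the sketch below. It assumes that the LazUtils package (which contains paswstring) is on the unit path and that merely listing the unit is enough to install its widestring manager; the program name and the sample string are made up for illustration.

```pascal
program paswstring_demo;

{$mode objfpc}{$H+}

uses
  // paswstring comes first so that its widestring manager is in place
  // before any other unit initializes; cwstring is deliberately absent.
  paswstring,
  SysUtils;

var
  W, Expected: UnicodeString;
begin
  W        := #$0410#$0411#$0412;  // Cyrillic capital letters U+0410..U+0412
  Expected := #$0430#$0431#$0432;  // their small counterparts U+0430..U+0432
  // UnicodeLowerCase is routed through whichever widestring manager is installed.
  WriteLn('Lowercase matches expectation: ', UnicodeLowerCase(W) = Expected);
end.
```

Building the same program once with cwstring and once with paswstring in the uses clause is one way to look for behavioural differences between the two managers.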

--
Felipe Monteiro de Carvalho


Re: [Lazarus] cwstring in arm-linux

Marco van de Voort
In reply to this post by Felipe Monteiro de Carvalho
On Mon, Oct 03, 2011 at 04:31:20PM +0200, Felipe Monteiro de Carvalho wrote:
> Ok, I changed the define in rev 32655.
>
> But you should note that when paswstring gets finished it will phase
> out cwstrings.

Not that I know. And btw, I also use arm-linux without android, so please
keep that target intact and aligned with normal linux ports.


Re: [Lazarus] cwstring in arm-linux

Felipe Monteiro de Carvalho
On Wed, Oct 19, 2011 at 12:06 PM, Marco van de Voort <[hidden email]> wrote:
> Not that I know. And btw, I also use arm-linux without android, so please
> keep that target intact and aligned with normal linux ports.

What is the difference between using cwstring and paswstring? Any
reason for not wanting to use paswstring?

They should be 100% equal, except that one does not require any
external libraries. If you could test and check whether there are any
differences, that would of course be excellent =)

--
Felipe Monteiro de Carvalho


Re: [Lazarus] cwstring in arm-linux

Martin Schreiber
On Wednesday 19 October 2011 13.14:50 Felipe Monteiro de Carvalho wrote:
> On Wed, Oct 19, 2011 at 12:06 PM, Marco van de Voort <[hidden email]> wrote:
> > Not that I know. And btw, I also use arm-linux without android, so please
> > keep that target intact and aligned with normal linux ports.
>
> What is the difference between using cwstring and paswstring? Any
> reason for not wanting to use paswstring?
>
Where is paswstring?

Martin


Re: [Lazarus] cwstring in arm-linux

Hans-Peter Diettrich
In reply to this post by Marco van de Voort
Marco van de Voort schrieb:
> On Mon, Oct 03, 2011 at 04:31:20PM +0200, Felipe Monteiro de Carvalho wrote:
>> Ok, I changed the define in rev 32655.
>>
>> But you should note that when paswstring gets finished it will phase
>> out cwstrings.
>
> Not that I know. And btw, I also use arm-linux without android, so please
> keep that target intact and aligned with normal linux ports.

After some discussions in Embarcadero groups I would like to learn more
about the FPC implementation and goals of the new (Unicode...) strings.
Where should I have a look?

In detail, it turned out that Delphi only supports CP_ACP strings for
ANSI code pages, not including UTF-8. Strings with other encodings may be
converted properly (not yet), but otherwise should not be used with the
standard string-handling procedures. Will this be changed in the FPC RTL,
so that at least UTF8Strings are also supported properly?

DoDi



Re: [Lazarus] cwstring in arm-linux

Felipe Monteiro de Carvalho
In reply to this post by Martin Schreiber
On Wed, Oct 19, 2011 at 1:24 PM, Martin Schreiber <[hidden email]> wrote:
> Where is paswstring?

http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/components/lazutils/paswstring.pas?view=markup&root=lazarus

It uses lazutf8 (which includes, most importantly, UTF16ToUTF8 and
vice versa, plus UTF8LowerCase and UTF8UpperCase) and lconvencoding
(which includes the encoding tables), both of which are in the same folder.

--
Felipe Monteiro de Carvalho


Re: [Lazarus] cwstring in arm-linux

Sven Barth
In reply to this post by Hans-Peter Diettrich
Am 19.10.2011 14:08, schrieb Hans-Peter Diettrich:

> Marco van de Voort schrieb:
>> On Mon, Oct 03, 2011 at 04:31:20PM +0200, Felipe Monteiro de Carvalho
>> wrote:
>>> Ok, I changed the define in rev 32655.
>>>
>>> But you should note that when paswstring gets finished it will phase
>>> out cwstrings.
>>
>> Not that I know. And btw, I also use arm-linux without android, so please
>> keep that target intact and aligned with normal linux ports.
>
> After some discussions in Embarcadero groups I would like to learn more
> about the FPC implementation and goals of the new (Unicode...) strings.
> Where should I have a look?
>
> In detail it turned out that Delphi only supports CP_ACP strings for
> Ansi codepages, not including UTF-8. Strings with other encodings may be
> converted properly (not yet), but otherwise should not be used with
> standard stringhandling procedures. Will this be changed in the FPC RTL,
> so that at least UTF8Strings are also supported properly?

Uhm... isn't this better suited in fpc-devel?

Regards,
Sven



Re: [Lazarus] cwstring in arm-linux

Martin Schreiber
In reply to this post by Felipe Monteiro de Carvalho
On Wednesday 19 October 2011 13.25:09 Felipe Monteiro de Carvalho wrote:
> On Wed, Oct 19, 2011 at 1:24 PM, Martin Schreiber <[hidden email]> wrote:
> > Where is paswstring?
>
> http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/components/lazutils/paswstring.pas?view=markup&root=lazarus
>
> It uses lazutf8 (which includes most importantly UTF16ToUTF8 and
> vice versa and utf8LowerCase and utf8UpperCase) and lconvencoding
> (which includes encoding tables) which are in the same folder.

Some possibly problematic points:
- Does it use locale-specific collation in PasUnicodeCompareStr and
  PasUnicodeCompareText?
- Is the performance of UTF8LowerCase and UTF8UpperCase OK?
- Do UTF8LowerCase and UTF8UpperCase cover all upper/lowercase Unicode
  (possibly accented) characters?
- Does it handle decomposed characters (cwstring doesn't)?

Martin


Re: [Lazarus] cwstring in arm-linux

Marco van de Voort
In reply to this post by Felipe Monteiro de Carvalho
On Wed, Oct 19, 2011 at 01:14:50PM +0200, Felipe Monteiro de Carvalho wrote:
> On Wed, Oct 19, 2011 at 12:06 PM, Marco van de Voort <[hidden email]> wrote:
> > Not that I know. And btw, I also use arm-linux without android, so please
> > keep that target intact and aligned with normal linux ports.
>
> What is the difference between using cwstring and paswstring? Any
> reason for not wanting to use paswstring?

Simply integrating with the OS, and avoiding the inclusion of tables when
not necessary.

Moreover you are stating something as a fact here that was not discussed at
all.
 
> They should be 100% equal, except that one does not require any
> external libraries. If you could test and check whether there are any
> differences, that would of course be excellent =)

I haven't been testing it, and don't plan to. I'm not interested in it, and
am not interested in growing the binaries unnecessarily.

I have no problem with having a second option for the people that do want
it, but that is something entirely different from what you were saying.

Cwstring is staying on all normal targets as far as I know.


Re: [Lazarus] cwstring in arm-linux

Felipe Monteiro de Carvalho
On Wed, Oct 19, 2011 at 6:47 PM, Marco van de Voort <[hidden email]> wrote:
> Moreover you are stating something as a fact here that was not discussed at
> all.

I am confused by your statements, the discussion here is about the
usage of cwstring in the LCL, then I said that I want to replace
cwstring with paswstring in the LCL (after making sure it is
completely equivalent).

Are you also discussing the usage of cwstring in the LCL? Your
comments make me think that you are assuming I am talking about the
RTL or something like that.

--
Felipe Monteiro de Carvalho


Re: [Lazarus] cwstring in arm-linux

Martin Schreiber
On Wednesday 19 October 2011 18.59:06 Felipe Monteiro de Carvalho wrote:

> On Wed, Oct 19, 2011 at 6:47 PM, Marco van de Voort <[hidden email]> wrote:
> > Moreover you are stating something as a fact here that was not discussed
> > at all.
>
> I am confused by your statements, the discussion here is about the
> usage of cwstring in the LCL, then I said that I want to replace
> cwstring with paswstring in the LCL (after making sure it is
> completely equivalent).
>
> Are you also discussing the usage of cwstring in the LCL? Your
> comments make me think that you are assuming I am talking about the
> RTL or something like that.

Ah, sorry, I read it wrong too...

Martin


Re: [Lazarus] cwstring in arm-linux

Marco van de Voort
In reply to this post by Felipe Monteiro de Carvalho
On Wed, Oct 19, 2011 at 06:59:06PM +0200, Felipe Monteiro de Carvalho wrote:
> I am confused by your statements, the discussion here is about the
> usage of cwstring in the LCL, then I said that I want to replace
> cwstring with paswstring in the LCL (after making sure it is
> completely equivalent).
>
> Are you also discussing the usage of cwstring in the LCL? Your
> comments make me think that you are assuming I am talking about the
> RTL or something like that.

No, sorry. Though I still think that is not a good thing either.


Re: [Lazarus] cwstring in arm-linux

Vincent Snijders
2011/10/19 Marco van de Voort <[hidden email]>:

> On Wed, Oct 19, 2011 at 06:59:06PM +0200, Felipe Monteiro de Carvalho wrote:
>> I am confused by your statements, the discussion here is about the
>> usage of cwstring in the LCL, then I said that I want to replace
>> cwstring with paswstring in the LCL (after making sure it is
>> completely equivalent).
>>
>> Are you also discussing the usage of cwstring in the LCL? Your
>> comments make me think that you are assuming I am talking about the
>> RTL or something like that.
>
> No, sorry. Though I still think that is not a good thing either.

I guess Felipe gave up waiting on a Unicode RTL for the time being and
goes for a full UTF8 pseudo RTL in LazUtils.

Vincent


Re: [Lazarus] cwstring in arm-linux

Felipe Monteiro de Carvalho
In reply to this post by Martin Schreiber
On Wed, Oct 19, 2011 at 6:33 PM, Martin Schreiber <[hidden email]> wrote:
> Does it use locale specific collation in PasUnicodeCompareStr and
> PasUnicodeCompareText?

Good point. No, not yet. But this affects only Turkish, Azeri and
Lithuanian AFAIK.

Adding Turkish and Azeri is trivial, because UTF8LowerCase supports
them, but I have not yet understood the rules for Lithuanian. They are
quite convoluted and depend on nearby characters and things like that.
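
As a small illustration of the Turkish case, the sketch below lowercases a capital I with and without a language hint. It assumes that UTF8LowerCase in lazutf8 takes an optional second argument naming the language and that 'tr' is the code it expects; both are assumptions about the unit, not something stated in this thread.

```pascal
program turkish_lowercase_sketch;

{$mode objfpc}{$H+}

uses
  LazUTF8;  // from the LazUtils package

const
  DotlessSmallI = #$C4#$B1;  // U+0131 (Latin small dotless i) as UTF-8 bytes

begin
  // Default rules: ASCII 'I' pairs with ASCII 'i'.
  WriteLn('Default rules give i: ', UTF8LowerCase('I') = 'i');
  // Turkish rules: capital I pairs with dotless small i (U+0131).
  // Assumption: 'tr' is the language code the routine expects.
  WriteLn('Turkish rules give dotless i: ', UTF8LowerCase('I', 'tr') = DotlessSmallI);
end.
```

A comparison routine that wanted the same behaviour would need to receive the language in some way as well, since the lowercase mapping differs per language.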

> Is the performance of UTF8LowerCase and UTF8UpperCase OK?

UTF8LowerCase was heavily optimized. UTF8UpperCase still needs to be
more optimized.

6 million UTF8LowerCase operations on the string "АБВЕЁЖЗКЛМНОПРДЙГ"
take 2.6 seconds on my computer. It outperforms iconv by a factor of
approximately 2.5x:

UTF8LowerCase -- Performance test took:
     804 ms     1896 ms     2318 ms     3460 ms     2647 ms
    1847 ms     2526 ms     2496 ms     1830 ms     1975 ms
CWString SysUtils.UnicodeLowerCase -- Performance test took:
    2456 ms     2461 ms     6594 ms     6170 ms     5347 ms
    6939 ms     4398 ms     4429 ms     2285 ms     2411 ms

For these strings:

      if j = 0 then Str := UTF8LowerCase('abcdefghijklmnopqrstuwvxyz');
      if j = 1 then Str := UTF8LowerCase('ABCDEFGHIJKLMNOPQRSTUWVXYZ');
      if j = 2 then Str := UTF8LowerCase('aąbcćdeęfghijklłmnńoóprsśtuwyzźż');
      if j = 3 then Str := UTF8LowerCase('AĄBCĆDEĘFGHIJKLŁMNŃOÓPRSŚTUWYZŹŻ');
      if j = 4 then Str := UTF8LowerCase('АБВЕЁЖЗКЛМНОПРДЙГ');
      if j = 5 then Str := UTF8LowerCase('名字叫嘉英,嘉陵江的嘉,英國的英');
      if j = 6 then Str := UTF8LowerCase('AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuWvVwXxYyZz');
      if j = 7 then Str := UTF8LowerCase('AAaaBBbbCCccDDddEEeeFFffGGggHHhhIIiiJJjjKKkkLLllMMmm');
      if j = 8 then Str := UTF8LowerCase('abcDefgHijkLmnoPqrsTuwvXyz');
      if j = 9 then Str := UTF8LowerCase('ABCdEFGhIJKlMNOpQRStUWVxYZ');
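
A timing loop of the kind that could produce such numbers might look like the sketch below. It is not the test program used for the figures above; the function name TimeLowerCase and the single sample string are illustrative, and the iteration count simply echoes the 6 million mentioned earlier.

```pascal
program lowercase_timing_sketch;

{$mode objfpc}{$H+}

uses
  SysUtils,  // GetTickCount64
  LazUTF8;   // UTF8LowerCase, from the LazUtils package

// Hypothetical helper: runs UTF8LowerCase Iterations times on S
// and returns the elapsed wall-clock time in milliseconds.
function TimeLowerCase(const S: string; Iterations: Integer): QWord;
var
  i: Integer;
  StartTick: QWord;
  Tmp: string;
begin
  StartTick := GetTickCount64;
  for i := 1 to Iterations do
    Tmp := UTF8LowerCase(S);
  Result := GetTickCount64 - StartTick;
end;

begin
  WriteLn('UTF8LowerCase took ',
          TimeLowerCase('ABCDEFGHIJKLMNOPQRSTUWVXYZ', 6000000), ' ms');
end.
```

Timings from such a loop depend heavily on the machine, the compiler options and the input string, so they are only meaningful relative to another routine measured the same way.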

> Do  UTF8LowerCase and UTF8UpperCase cover all upper/lowercase Unicode
> (possibly accented) characters?

UTF8LowerCase currently covers all characters in the latest Unicode
spec AFAIK. Of course I might have forgotten something, but I have
tests for chars from 0000 to 0580 and more tests for other clusters.

UTF8UpperCase is currently implemented from 0000 to 0450, but I will
add the rest.

> Does it handle decomposed characters (cwstring doesn't)?

I think that decomposed characters should work naturally. For example,
if we have [0]=A and [1]=~ (the tilde accent, in its special combining
version), which together form "Ã", and we pass that through lowercase,
we would get [0]=a and [1] unchanged, which forms "ã". Or am I
wrong?

If you are talking about handling in CompareText, then the answer
is that AFAIK it would be too inefficient to handle that in
CompareText, so we would need another routine for it,
NormalizedCompareText or something like that, which executes
normalization, then lowercase, and finally the comparison.
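
The two ideas in this message can be sketched as follows. Everything here is hypothetical: NormalizedCompareText does not exist in lazutf8 as far as this thread shows, TNormalizeFunc is an invented hook, and IdentityNormalize is a placeholder that performs no normalization at all, so a real Unicode normalizer would have to be plugged in. Note that in Unicode the combining mark follows its base letter.

```pascal
program normalized_compare_sketch;

{$mode objfpc}{$H+}

uses
  SysUtils,  // CompareStr
  LazUTF8;   // UTF8LowerCase, from the LazUtils package

type
  // Hypothetical hook: any routine that returns a normalized UTF-8 string.
  TNormalizeFunc = function(const S: string): string;

// Placeholder only: returns its input unchanged (no real normalization).
function IdentityNormalize(const S: string): string;
begin
  Result := S;
end;

// Normalize first, then lowercase, then compare byte-wise.
function NormalizedCompareText(const A, B: string;
  Normalize: TNormalizeFunc): Integer;
begin
  Result := CompareStr(UTF8LowerCase(Normalize(A)),
                       UTF8LowerCase(Normalize(B)));
end;

const
  CombiningTilde = #$CC#$83;  // U+0303 (combining tilde) as UTF-8 bytes

var
  Upper, Lower: string;
begin
  Upper := 'A' + CombiningTilde;  // decomposed form of the capital letter
  Lower := 'a' + CombiningTilde;  // decomposed form of the small letter
  WriteLn('Decomposed lowercase matches: ', UTF8LowerCase(Upper) = Lower);
  WriteLn('Sketch comparison reports equal: ',
          NormalizedCompareText(Upper, Lower, @IdentityNormalize) = 0);
end.
```

Keeping normalization behind a separate routine leaves the plain comparison path free of that cost, which is the trade-off described above.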

--
Felipe Monteiro de Carvalho
