[Lazarus] Replacing accented letters

Free Pascal - Lazarus mailing list
Hi,

Is there an easy way to replace accented letters (mostly the French
ones) with their unaccented equivalents? E.g. é -> e.

I think I could do it with a lookup table, but are there more efficient
ways?
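
For reference, the lookup-table idea might look like the sketch below. It is written in Python purely as a language-neutral illustration, not as Lazarus/FPC code; the table `ACCENT_MAP` and the function `strip_accents_table` are hypothetical names, and the table covers only some common French letters.

```python
# Hypothetical lookup table: maps a few precomposed accented letters
# (mostly French) to their base letters. It is NOT exhaustive.
ACCENT_MAP = str.maketrans({
    "\u00e0": "a",  # a with grave
    "\u00e2": "a",  # a with circumflex
    "\u00e7": "c",  # c with cedilla
    "\u00e8": "e",  # e with grave
    "\u00e9": "e",  # e with acute
    "\u00ea": "e",  # e with circumflex
    "\u00eb": "e",  # e with diaeresis
    "\u00ee": "i",  # i with circumflex
    "\u00ef": "i",  # i with diaeresis
    "\u00f4": "o",  # o with circumflex
    "\u00f9": "u",  # u with grave
    "\u00fb": "u",  # u with circumflex
    "\u00fc": "u",  # u with diaeresis
    "\u00c9": "E",  # E with acute
    "\u00c0": "A",  # A with grave
    "\u00c7": "C",  # C with cedilla
})


def strip_accents_table(text: str) -> str:
    """Replace every character listed in ACCENT_MAP; leave all others untouched."""
    return text.translate(ACCENT_MAP)
```

A table like this is simple and fast, but any letter missing from it passes through unchanged, so it has to be extended by hand for each new language.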

TIA,

Koenraad.
--
_______________________________________________
Lazarus mailing list
[hidden email]
https://lists.lazarus-ide.org/listinfo/lazarus

Re: [Lazarus] Replacing accented letters

You could use Unicode character decomposition.

For example, é (U+00E9) can be decomposed into an equivalent string of
the base letter e (U+0065) and combining acute accent (U+0301).

Then, you could simply delete combining acute accents, leaving just the
base letters.
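
A minimal sketch of this decomposition approach, shown in Python only to illustrate the idea (it is not Lazarus/FPC code, and the function name `strip_accents` is made up for this example). It assumes the canonical decomposition meant here is Unicode normalization form NFD, and it drops every combining mark rather than only the acute accent:

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose to NFD, then drop all combining marks (assumed approach)."""
    # NFD splits a precomposed letter such as U+00E9 into U+0065 + U+0301.
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns a non-zero class for combining marks; skip those.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Letters without a canonical decomposition, such as œ or ø, are left unchanged by this method, so a small lookup table may still be needed for them.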

Denis


On 04/10/2017 17:08, Koenraad Lelong via Lazarus wrote:

> Hi,
>
> Is there an easy way to replace accented letters (mostly the French
> ones) with their unaccented equivalents? E.g. é -> e.
>
> I think I could do it with a lookup table, but are there more
> efficient ways?
>
> TIA,
>
> Koenraad.

Re: [Lazarus] Replacing accented letters

I needed this a long time ago; there is probably a simpler way now.

Using utf8tools and LazUTF8: 
https://gist.github.com/zbyna/6d9cd98ca22fa4261f54a0a06a7e6f51


On 4.10.2017 at 18:19, Denis Kozlov via Lazarus wrote:

> You could use Unicode character decomposition.
>
> For example, é (U+00E9) can be decomposed into an equivalent string of
> the base letter e (U+0065) and combining acute accent (U+0301).
>
> Then, you could simply delete combining acute accents, leaving just
> the base letters.
>
> Denis

Re: [Lazarus] Replacing accented letters

On 05-10-17 at 00:29, Zbyněk Fiala via Lazarus wrote:

> I needed this a long time ago; there is probably a simpler way now.
>
> Using utf8tools and LazUTF8:
> https://gist.github.com/zbyna/6d9cd98ca22fa4261f54a0a06a7e6f51
>
Hi,

I tried your routine, but it does not seem to work. When I pass a plain
string, I get the same string back, which is fine.
When I pass a string with accented letters, I get an empty string back.

Is there a way to identify the encoding of the string, i.e. UTF-8, UTF-16, ...?

Software is intended for Windows.

TIA,

Koenraad

Re: [Lazarus] Replacing accented letters

On 11.10.2017 17:31, Koenraad Lelong via Lazarus wrote:
> Software is intended for Windows.

In this case you may use WinAPI as well:
https://stackoverflow.com/a/1892432/1231269

Ondrej

Re: [Lazarus] Replacing accented letters

On 11-10-17 at 17:45, Ondrej Pokorny via Lazarus wrote:
> On 11.10.2017 17:31, Koenraad Lelong via Lazarus wrote:
>> Software is intended for Windows.
>
> In this case you may use WinAPI as well:
> https://stackoverflow.com/a/1892432/1231269
>
> Ondrej

Hi Ondrej,

On that page there are two pieces of code; I tried them both.
One, OStripAccents, does not work: it returns the same string (AFAIK).
The other, BestFit, does not compile because WideCharToMultiByte is not
recognized. Does FPC have WideCharToMultiByte? I searched the RTL (of
FPC 3.0) but couldn't find it. I googled, but found no relevant answer.

TIA

Re: [Lazarus] Replacing accented letters

On 13.10.2017 15:32, Koenraad Lelong via Lazarus wrote:

> On that page there are two pieces of code; I tried them both.
> One, OStripAccents, does not work: it returns the same string (AFAIK).

OStripAccents is Delphi 2009+ only.

> The other, BestFit, does not compile because WideCharToMultiByte is not
> recognized. Does FPC have WideCharToMultiByte? I searched the RTL (of
> FPC 3.0) but couldn't find it. I googled, but found no relevant answer.

Add the Windows unit to your uses clause.

Ondrej