ASCII Encoding missing PowerShell Studio 2020 5.7.173

KaterKarlo
Posts: 10
Joined: Mon Mar 31, 2014 2:41 am

ASCII Encoding missing PowerShell Studio 2020 5.7.173

Post by KaterKarlo »

ASCII encoding is missing from the encoding dropdown in PowerShell Studio 2020 5.7.173.

brittneyr
Site Admin
Posts: 428
Joined: Thu Jun 01, 2017 7:20 am

Re: ASCII Encoding missing PowerShell Studio 2020 5.7.173

Post by brittneyr »

This was done purposefully. Is there a particular reason as to why you would need your files encoded in ASCII?
Brittney Ryn
SAPIEN Technologies, Inc.

KaterKarlo
Posts: 10
Joined: Mon Mar 31, 2014 2:41 am

Re: ASCII Encoding missing PowerShell Studio 2020 5.7.173

Post by KaterKarlo »

We use PowerShell scripts, for instance, to create email content with specific encodings. We need direct control over the script encoding itself, instead of using recoding mechanisms for output. The same applies to the use of regular expressions (e.g. iso-8859-1 special characters).

Alexander Riedel
Posts: 7343
Joined: Tue May 29, 2007 4:43 pm

Re: ASCII Encoding missing PowerShell Studio 2020 5.7.173

Post by Alexander Riedel »

I apologize in advance if I don't understand this correctly. But you seem to imply that PowerShell's output encoding is somehow related to the encoding of the script producing the output.
I happen to know that this is not the case in a lot of corners of PowerShell.
Maybe you can elaborate a little more on what you are trying to accomplish. Perhaps we can recommend another way.
Windows 1252 encoding was removed, as it is not in much use anymore and simply is not supported in .NET Core 3.1 and thereby PowerShell 7.
Alexander Riedel
SAPIEN Technologies, Inc.
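
On the .NET Core point above: as far as I know, code-page encodings such as Windows-1252 are not built into .NET Core but can be made available through `System.Text.CodePagesEncodingProvider`. The following is only a sketch for checking what a given PowerShell session can resolve. It assumes nothing about how PowerShell Studio handles encodings, and the variable names are made up for illustration.

```powershell
# Assumption: the provider type may or may not exist in this runtime,
# so look it up with -as [type] instead of referencing it directly.
$providerType = 'System.Text.CodePagesEncodingProvider' -as [type]
if ($providerType) {
    [System.Text.Encoding]::RegisterProvider($providerType::Instance)
}

# Try to resolve Windows-1252; GetEncoding throws if it is unavailable.
try {
    $cp1252 = [System.Text.Encoding]::GetEncoding(1252)
    $cp1252Available = $true
}
catch {
    $cp1252Available = $false
}

# iso-8859-1 (code page 28591) is resolved by name here.
$latin1 = [System.Text.Encoding]::GetEncoding('iso-8859-1')

"Windows-1252 available: $cp1252Available"
"iso-8859-1 code page:   $($latin1.CodePage)"
```

If Windows-1252 resolves in your session, the encoding can still be used for reading and writing data files even when an editor no longer offers it for saving scripts.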

KaterKarlo
Posts: 10
Joined: Mon Mar 31, 2014 2:41 am

Re: ASCII Encoding missing PowerShell Studio 2020 5.7.173

Post by KaterKarlo »

The iso-8859-n encodings are used heavily in Europe. Many systems, e.g. throughout the supply chain, still don't support utf-n encodings, irrespective of any Microsoft strategy. Writing a script in an encoding superset can introduce quite a few nitty-gritty problems. Example:

Take the following code:

$enc = [System.Text.Encoding]::GetEncoding("iso-8859-1")
$s = [System.IO.File]::ReadAllText('r:\Temp\test.txt', $enc)
$b = $s -match '^äöü.*'
Write-Host "Content: $s - Match: $b"

Save it once as utf-8 and once as iso-8859-1. Create the test file test.txt in iso-8859-1 with äöü as content.

Both scripts output äöü to the console, but the match fails with the utf-8 script. I don't say that this can't be handled, but it introduces additional effort. If support for non-utf-n encodings were permanently removed from PowerShell Studio, we would discontinue using it.
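
One possible explanation for the failed match, offered as a sketch and not a confirmed diagnosis: if the UTF-8 script is saved without a byte order mark and the host decodes it with a single-byte code page (my understanding is that Windows PowerShell 5.1 does this), then the regex literal in the script is misread while the text read from test.txt is decoded correctly. The snippet simulates that misreading on bytes only. It builds äöü from code points so that the sketch does not depend on its own file encoding, and all variable names are illustrative.

```powershell
# Build 'äöü' from code points (U+00E4, U+00F6, U+00FC).
$literal = -join ([char[]](0xE4, 0xF6, 0xFC))

# Bytes an editor writes when it saves that literal as UTF-8.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($literal)

# What a parser sees if it decodes those bytes as iso-8859-1 instead.
$latin1  = [System.Text.Encoding]::GetEncoding('iso-8859-1')
$misread = $latin1.GetString($utf8Bytes)

# Stand-in for the content of test.txt after a correct ReadAllText call.
$fileText = $literal

# Pattern as the author intended it versus pattern as the parser saw it.
$matchIntended = $fileText -match ('^' + [regex]::Escape($literal) + '.*')   # True
$matchMisread  = $fileText -match ('^' + [regex]::Escape($misread) + '.*')   # False

"Intended pattern matches: $matchIntended"
"Misread pattern matches:  $matchMisread (pattern is $($misread.Length) characters long)"
```

Under that assumption the console output looks right in both cases because it comes from the file, not from the literal in the script.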

Alexander Riedel
Posts: 7343
Joined: Tue May 29, 2007 4:43 pm

Re: ASCII Encoding missing PowerShell Studio 2020 5.7.173

Post by Alexander Riedel »

Thank you for the detailed information. I will investigate and see what I can find out. I am a little surprised to hear that Europe hasn't moved altogether to UTF-16, considering it has the biggest pile of individual alphabets :D but I understand you have to work with what you have.
Alexander Riedel
SAPIEN Technologies, Inc.

Alexander Riedel
Posts: 7343
Joined: Tue May 29, 2007 4:43 pm

Re: ASCII Encoding missing PowerShell Studio 2020 5.7.173

Post by Alexander Riedel »

Someone who has much more knowledge about PowerShell than yours truly has informed me that the problem herein lies with the use of PowerShell and reading an 'encoded' file.
As I suspected, the script's encoding has little to do with that, even though you may find some coincidental pairings that work.
I will move this thread over to the PowerShell forum, so maybe someone can help out with doing that independent of your script's encoding.
Alexander Riedel
SAPIEN Technologies, Inc.

Alexander Riedel
Posts: 7343
Joined: Tue May 29, 2007 4:43 pm

Re: ASCII Encoding missing PowerShell Studio 2020 5.7.173

Post by Alexander Riedel »

[Topic moved by moderator]
Alexander Riedel
SAPIEN Technologies, Inc.

jvierra
Posts: 14358
Joined: Tue May 22, 2007 9:57 am

Re: ASCII Encoding missing PowerShell Studio 2020 5.7.173

Post by jvierra »

Hi Kater Karlo.

These articles may help to explain what is happening. The issue can occur when you convert text from one encoding to another. If we read text into PowerShell, it is converted to internal Unicode. ANSI files store characters from the extended ASCII set in a way that is incompatible with Unicode, so when we try to match characters they won't match. This appears to affect all conversions between UTF-8 and ANSI files.

Here are some clues as to what may be happening:

https://www.i18nqa.com/debug/bug-double-conversion.html
https://www.i18nqa.com/debug/bug-utf-8-latin1.html
https://www.i18nqa.com/debug/table-iso8 ... -1252.html

Here is another good article about character encoding with ISO-8859-1 and Windows default of CP1252.

https://en.wikipedia.org/wiki/Windows-1252

To solve the issue you need to find out how the characters are stored in the original file. When you force the read to map against the ISO character set, the "match" works. That is because the ISO set was used to create the file. The file encoding of the script file (UTF-8 or ANSI) makes no difference to this. The only issue is when you want to write the 8-bit text out and the output encoding is not set correctly. The links will help to explain why this is.
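
A minimal sketch of reading with an explicit encoding, assuming a throwaway file in the temp folder (the file name is made up). To keep the script independent of how it is saved, the pattern uses `\uXXXX` regex escapes instead of literal umlauts. That is a swapped-in technique, not something proposed earlier in this thread.

```powershell
$enc  = [System.Text.Encoding]::GetEncoding('iso-8859-1')
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'encoding-sketch.txt'

# Create the test file in iso-8859-1; the content is built from code points.
$content = -join ([char[]](0xE4, 0xF6, 0xFC))
[System.IO.File]::WriteAllText($path, $content, $enc)

# One byte per character confirms the file really is single-byte encoded.
$byteCount = [System.IO.File]::ReadAllBytes($path).Length

# Read it back with the same encoding the file was written in.
$s = [System.IO.File]::ReadAllText($path, $enc)

# ASCII-only pattern: \u00E4 \u00F6 \u00FC stand for the three umlauts.
$b = $s -match '^\u00E4\u00F6\u00FC.*'

Remove-Item -LiteralPath $path
"Bytes on disk: $byteCount - Match: $b"
```

Because the script source is plain ASCII, it should behave the same whether the editor saves it as UTF-8 or as a single-byte code page.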

jvierra
Posts: 14358
Joined: Tue May 22, 2007 9:57 am

Re: ASCII Encoding missing PowerShell Studio 2020 5.7.173

Post by jvierra »

KaterKarlo wrote:
Thu Mar 05, 2020 10:15 am
We use PowerShell scripts, for instance, to create email content with specific encodings. We need direct control over the script encoding itself, instead of using recoding mechanisms for output. The same applies to the use of regular expressions (e.g. iso-8859-1 special characters).
To directly address your issue of output encoding: the output encoding has nothing to do with the way a script file is encoded.
When using the iso-8859 character encoding the output file encoding should be UTF8. This allows the correct characters to be sent to the file.

Use the following:
Out-File -Encoding UTF8
Set-Content -encoding UTF8

Please note that file encoding is not the same thing as character encoding. ASCII is not a file encoding; it is a character encoding specified by ANSI. This type of file defaults to Windows ASCII (CP-1252) when read by Windows unless otherwise specified. When storing with UTF-8 file encoding, the character set selected by the output encoding setting controls the character mapping into the file. If the characters are in PowerShell, their encoding will be translated to the output character encoding.

PowerShell default encoding is here: $OutputEncoding

In the US this is: "US-Ascii"

The text encoding is: [cultureinfo]::CurrentCulture.TextInfo

In the US this is: "Ansi Code Page 1252"
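
To see what these settings are on a particular machine, and how many bytes the same three characters occupy under different encodings, something like the following can be run. It is a sketch only: the reported values depend on the PowerShell version and the regional settings, so no specific output is implied for the first two lines.

```powershell
# Encoding PowerShell uses when piping text to native programs.
$pipeEncoding = $OutputEncoding.WebName

# ANSI code page associated with the current culture.
$ansiCodePage = [cultureinfo]::CurrentCulture.TextInfo.ANSICodePage

"OutputEncoding:  $pipeEncoding"
"ANSI code page:  $ansiCodePage"

# The same text turned into bytes with two different encodings.
$text        = -join ([char[]](0xE4, 0xF6, 0xFC))
$latin1Bytes = [System.Text.Encoding]::GetEncoding('iso-8859-1').GetBytes($text)
$utf8Bytes   = [System.Text.Encoding]::UTF8.GetBytes($text)

"iso-8859-1 bytes: $($latin1Bytes.Length)"
"UTF-8 bytes:      $($utf8Bytes.Length)"
```

The byte counts differ because iso-8859-1 stores each of these characters in one byte, while UTF-8 needs two bytes for each of them.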
