
DirectorySearcher problems with character sets


Peter Aragon

Apr 15, 2004, 11:15:59 AM

Hi

I have a problem reading from our LDAP server with DirectorySearcher.
For example, when I retrieve data for people with names like "René" or
"Jerôme", they appear in the SearchResult as "Ren" and "Jerme".
With a third-party tool I can see that René is stored as "52 65 6E E9".
Because E9 > 128, I'm afraid something is wrong with the DirectorySearcher
implementation, as I cannot retrieve those characters, even with the
Encoding GetString methods.

Anyone who knows a workaround?
Thanks,
Peter Aragon


Peter Aragon

Apr 15, 2004, 11:52:17 AM

I see it is stored as a binary attribute instead of a text attribute; maybe
this helps?


Rhett Gong [MSFT]

Apr 16, 2004, 6:20:24 AM

Hi Peter,
From your description, you found with a third-party tool that René is stored as "52 65 6E E9". Since E9 > 128, you
suspect there might be an error in the DirectorySearcher implementation.

Before we go further, could you answer the following questions please:
1> How did you use the DirectorySearcher in your code, and which attribute are you trying to retrieve? Could
you post a code snippet?
2> Which third-party tool is it, and how can I get the problem value from it?

Thanks. I am looking forward to hearing from you.

Have a great weekend!
Rhett Gong [MSFT]
Microsoft Online Partner Support

This posting is provided "AS IS" with no warranties, and confers no rights.
Please reply to newsgroups only. Thanks.

Peter Aragon

Apr 16, 2004, 6:51:53 AM

Hi Rhett
 
The third-party tool I used to find this out is Softerra LDAP Browser (2.5).
The name is stored in the "CN" property. For retrieval purposes this CN can be an array of names, so if you look for René you can also find him as Rene.
The following code is used. I eliminated the need for a DirectorySearcher just to make sure the flaw isn't in DirectorySearcher.
strPath is a correct LDAP path that uniquely points to one object.
The PrintObject function is from an MSDN article about setting Binary Properties Using System.DirectoryServices, but it says nothing about retrieving them correctly. I only get strings back, not a byte array or anything similar that I could run through a text decoder with the proper code page to convert it to UTF8.
 

DirectoryEntry objDE = new DirectoryEntry(strPath, "", "", AuthenticationTypes.Anonymous);
PropertyValueCollection values = objDE.Properties["cn"];
object x = values.Value;
PrintObject(x, "hash");

static void PrintObject(object x, string name)
{
    // Value is an object[] for a multi-valued attribute, but a single
    // object when the attribute holds exactly one value.
    object[] a = x as object[];
    if (a == null)
    {
        a = new object[] { x };
    }
    System.Diagnostics.Debug.WriteLine(x.GetType());
    System.Diagnostics.Debug.WriteLine(a[0].GetType());
    System.Diagnostics.Debug.WriteLine(a.Length);
    for (int i = 0; i < a.Length; ++i)
    {
        System.Diagnostics.Debug.WriteLine(name + "[" + i + "] = " + a[i]);
    }
}

Thanks for the quick reply,

Peter Aragon

 

Rhett Gong [MSFT]

Apr 19, 2004, 8:14:39 AM

I used the following code and got values.Value = "Builtin":
//----------------------------------------------------------------------------------------------------
String strPath="LDAP://test.microsoft.com/CN=Builtin,DC=test,DC=microsoft,DC=com";
System.DirectoryServices.DirectoryEntry objDE = new System.DirectoryServices.DirectoryEntry(strPath);
System.DirectoryServices.PropertyValueCollection values = objDE.Properties["cn"];
//------------------------------------------------------------------------------------------------------

But I am still not clear on your problem, so please help by answering the following questions. Thanks.
1> How did you use DirectorySearcher for Rene? <-- If I get this, I will know how to use "LDAP Browser" to check the value.
2> What does "hash" mean in the line PrintObject(x, "hash")? Is it something special?
3> Could you post the problem code so I can see what happens in DirectorySearcher?

Have a nice day!

Peter Aragon

Apr 19, 2004, 8:28:24 AM

Hi Rhett

The only problem at the moment is that I can only get a string back from the
PropertyValueCollection. The entries in our LDAP for people with 'strange'
accents in their names are stored as binary information. However,
DirectoryServices doesn't allow me to get at this raw binary information; the
return value is always an 'interpreted' string, and this string does not
contain the 'strange' characters.

So I want to know if there is a way to obtain the 'true' binary value from
the LDAP instead of the 'interpreted' string returned by DirectoryServices.
If there is a way to tell DirectoryServices which code page to use to convert
the binary information into a string, that would also be OK.

Somehow the team who built DirectoryServices forgot that there are
situations where LDAP entry values are not necessarily UTF-8, but can also
be Western European ISO characters.

I don't have any more code to give you, as this is the only code needed to
show the problem. Add a binary value "52 65 6E E9" to an LDAP and try to get
it back with DirectoryServices as "René".

Thanks,
Peter



Rhett Gong [MSFT]

Apr 20, 2004, 1:26:43 AM

Oh, I get it. Your string contains the European character "é" (0x00E9), and from DirectoryServices you get the string
as "Ren?" (0x0052 0x0065 0x006E 0x003F), but it should be displayed as "René". Can you set the locale to
one that can display the "é" correctly? And please let me know the status. Thanks.

Peter Aragon

Apr 20, 2004, 3:41:33 AM

Hi Rhett
 
I don't even get the fourth character back; that's the issue. Otherwise I could do the conversion myself for the few characters that have an umlaut, apostrophe, accant aggu, circonflex and whatever else I can mistype ;-). I only get back three characters instead of four. Changing the code page for the entire OS would not be an acceptable solution, and Unicode is perfectly able to show those characters anyway.
I think internally the binary value is decoded using a fixed UTF8 decoding. What I need however is a UTF7 decoding. The following code points strongly to this:
 

byte[] binaryString = new byte[]{0x52,0x65,0x6e,0xe9, 0x20, 0x42, 0x72, 0xfc, 0x6e, 0x6b, 0x65, 0x6e};
//The effect occurring with DirectoryServices: René is shown as Ren.
//The only difference is that DirectoryServices shows Brünken as Br and nothing else.
MessageBox.Show( System.Text.Encoding.UTF8.GetString(binaryString) );
//The desired effect
MessageBox.Show( System.Text.Encoding.UTF7.GetString(binaryString) );

The only strange effect is that the ü causes DirectoryServices not to output the rest of the string.
Seems like a bug to be fixed for the next release?
 
Thanks,
Peter Aragon
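
Editor's note: the bytes quoted in this post are what ISO-8859-1 (Latin-1) would produce for "René Brünken", with 0xE9 for é and 0xFC for ü. The following is only a minimal sketch under the assumption that the directory really stores Latin-1 text; `LegacyBytesSketch` and its method names are hypothetical and not part of DirectoryServices. UTF7.GetString may have appeared to work on the framework of the time because it seems to pass bytes above 0x7F through unchanged, which is not the same as the data being UTF-7.

```csharp
using System;
using System.Text;

public static class LegacyBytesSketch
{
    // Bytes copied from the post above (one byte per character).
    public static readonly byte[] Sample =
    {
        0x52, 0x65, 0x6E, 0xE9, 0x20, 0x42, 0x72, 0xFC, 0x6E, 0x6B, 0x65, 0x6E
    };

    // Assumption: the stored text is ISO-8859-1, so every byte maps
    // directly to the Unicode code point with the same value.
    public static string DecodeLatin1(byte[] raw)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(raw);
    }

    // For comparison: 0xE9 and 0xFC are not valid UTF-8 on their own,
    // so a UTF-8 decoder cannot recover the accented characters.
    public static string DecodeUtf8(byte[] raw)
    {
        return Encoding.UTF8.GetString(raw);
    }
}
```

Calling `LegacyBytesSketch.DecodeLatin1(LegacyBytesSketch.Sample)` should give "René Brünken", while `DecodeUtf8` on the same bytes loses the accented characters. This only helps once the raw bytes are available, which is the open question in this thread.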
 

Rhett Gong [MSFT]

Apr 20, 2004, 7:26:19 AM

Hi Peter,
I used the following code with the user René.

DirectoryEntry objDE = new DirectoryEntry(strPath, "", "", AuthenticationTypes.Anonymous);
PropertyValueCollection values = objDE.Properties["cn"];
In my test on Windows 2000 Server (English), it returns the correct value "René". So, could you tell me which specific settings in your project or test environment
cause the string to break?
I am looking forward to your reply. Thanks.

Rhett Gong [MSFT]

Apr 21, 2004, 11:52:31 PM

Hi,
1> Use the following lines (unmanaged code) to see if you get all the characters, especially 0xE9, returned at your client with different locale settings.
//---------------------------------------------------------------------------------
hr = ADsGetObject(L"LDAP://test.microsoft.com/CN=René,CN=Users,DC=Test,DC=microsoft,DC=com",IID_IADs, (void **) & oUsr);
oUsr->Get(L"cn",&pProp);
//---------------------------------------------------------------------------------

2> Using .NET DirectoryServices: apply this code to see if it resolves your problem.
//-----------------------------------------------------------------
// Find out which locale the LDAP server uses. Assuming it is French, try the following code.
System.Threading.Thread.CurrentThread.CurrentCulture = new System.Globalization.CultureInfo("fr",true);
System.Threading.Thread.CurrentThread.CurrentUICulture = new System.Globalization.CultureInfo("fr",true);
//-----------------------------------------------------------------------

Please apply my suggestions and let me know your result. Thanks.

Peter Aragon

Apr 22, 2004, 3:41:42 AM

Hi Rhett

To no avail.
Setting the CultureInfo to fr got me the error: Culture "fr" is a neutral
culture. It can not be used in formatting and parsing and therefore cannot
be set as the thread's current culture. So I set it to "fr-FR" just to prove
that setting the culture is not the right solution, and I get the same results.
It's not about how the strings are put on screen: even in Debug mode, when
switching to binary mode, I see 3 characters for René (Ren), not 4. That
will not change unless I get a byte array back or am able to set the encoding
for the string conversion in DirectoryServices. Using ADsGetObject will
certainly work; it's just that the .NET way shows a regression or less
functionality, and that is a reason not to switch to .NET. I'm looking at a
showstopper for this project. The strings in the LDAP are encoded in UTF7.
Changing the LDAP to UTF8 is not an option either, as a lot of legacy
applications still rely on UTF7 and don't understand UTF8.
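
Editor's note: the neutral-culture error quoted above can be avoided by asking the framework for the specific culture that belongs to a neutral name. The sketch below is only illustrative; `CultureSketch` is a hypothetical helper. As the post reports, the thread culture governs formatting and parsing, so this would not be expected to change how DirectoryServices decodes the stored bytes.

```csharp
using System;
using System.Globalization;

public static class CultureSketch
{
    // Maps a neutral culture name such as "fr" to its default specific
    // culture, which is safe to use for formatting and parsing.
    public static CultureInfo ToSpecific(string cultureName)
    {
        return CultureInfo.CreateSpecificCulture(cultureName);
    }

    // Reports whether a culture name is neutral (language only, no region).
    public static bool IsNeutral(string cultureName)
    {
        return new CultureInfo(cultureName).IsNeutralCulture;
    }
}
```

For example, `CultureSketch.ToSpecific("fr")` could be assigned to `Thread.CurrentThread.CurrentCulture` in place of `new CultureInfo("fr", true)`.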

So could you please forward this to the people within the .NET group
responsible for DirectoryServices as a feature request/bug report? Too bad
that neither I nor others spotted this behavior during the last 2 beta trials
of .NET 1.0 and 1.1. What I would want is the raw byte array instead of a
wrongly decoded string value, or a way to give DirectoryServices a hint about
the decoding scheme. Otherwise, how could you retrieve binary information from
LDAP with .NET if you only get back a string?

Thanks,
Peter Aragon
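
Editor's note: later framework versions added the System.DirectoryServices.Protocols namespace, which was not available in .NET 1.0 or 1.1 when this thread was written. There, `DirectoryAttribute.GetValues(typeof(byte[]))` can hand back attribute values as byte arrays. The sketch below shows how that might be combined with a caller-chosen encoding. `RawLdapValueSketch`, its method names, and the server and DN arguments are hypothetical, and whether a given server returns the legacy bytes unchanged is an assumption that would need testing.

```csharp
using System;
using System.Collections.Generic;
using System.DirectoryServices.Protocols;
using System.Text;

public static class RawLdapValueSketch
{
    // Decodes raw attribute values with an encoding chosen by the caller.
    public static string[] DecodeAll(object[] rawValues, Encoding encoding)
    {
        var decoded = new List<string>();
        foreach (object value in rawValues)
        {
            decoded.Add(encoding.GetString((byte[])value));
        }
        return decoded.ToArray();
    }

    // Reads one attribute of one entry over an anonymous LDAP v3 bind and
    // decodes its values from raw bytes instead of pre-converted strings.
    public static string[] ReadValues(string server, string entryDn,
                                      string attributeName, Encoding encoding)
    {
        using (var connection = new LdapConnection(new LdapDirectoryIdentifier(server)))
        {
            connection.AuthType = AuthType.Anonymous;
            connection.SessionOptions.ProtocolVersion = 3;
            connection.Bind();

            var request = new SearchRequest(
                entryDn, "(objectClass=*)", SearchScope.Base, attributeName);
            var response = (SearchResponse)connection.SendRequest(request);

            var results = new List<string>();
            foreach (SearchResultEntry entry in response.Entries)
            {
                DirectoryAttribute attribute = entry.Attributes[attributeName];
                if (attribute == null) continue; // attribute not present
                results.AddRange(DecodeAll(attribute.GetValues(typeof(byte[])), encoding));
            }
            return results.ToArray();
        }
    }
}
```

A call might look like `RawLdapValueSketch.ReadValues("ldap.example.com", strPath, "cn", Encoding.GetEncoding("iso-8859-1"))`, where the host name is a placeholder and `strPath` would be the entry's distinguished name without the `LDAP://` prefix.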


Rhett Gong [MSFT]

Apr 23, 2004, 1:44:20 AM

> Else how could you retrieve binary information from LDAP
>with .NET if you would only get back a string?
It seems that the "raw byte array" is marshalled as an object, so this problem is more related to COM interop. As far
as I know, there is no supported way to get the "raw byte array" back at present.

>So could you please forward this to the people within the .NET group
>responsible for DirectoryServices as a feature request/bug report?

I have sent an email to them. If I get any answer for this problem, I will post it here.

Thanks and have a good day!
