Sort files collection ?

Vilius Mockūnas

Aug 20, 2009, 9:51:12 AM
Hello,

I use FileSystemObject to get the files collection of a folder (the Files
property).
How do I sort this collection by name, by date, and etc.?
I can build the sort logic myself, of course, but maybe there is an easy standard way?

thanks
Vilius


ekkehard.horner

Aug 20, 2009, 10:17:37 AM
Vilius Mockūnas wrote:

(1) use .Run or .Exec to do a "dir /O ..." command and use its
output: simple, fast, but not very flexible (what hides behind
your "and etc"?)

(2) put the objects of the files collection in an array, write a
sort sub/function that uses the relevant property/ies for
comparison: a lot of work

(3) put the properties of the objects in the files collection in
a disconnected ADO recordset, use its .Sort method to solve
your problem: easy and flexible
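
Option (3) could look roughly like the following sketch. The folder path, the field names (FileName, FileSize, Modified) and the chosen sort order are assumptions made up for illustration; only the Files collection, the disconnected ADODB.Recordset and its Sort property come from the description above.

```vbscript
Option Explicit

' ADO data type constants
Const adVarChar = 200
Const adDouble = 5
Const adDate = 7

Dim fso, folder, file, rs

Set fso = CreateObject("Scripting.FileSystemObject")
Set folder = fso.GetFolder("C:\Windows")   ' example folder (assumption)

' Build a disconnected recordset with one field per file property.
Set rs = CreateObject("ADODB.Recordset")
rs.Fields.Append "FileName", adVarChar, 255
rs.Fields.Append "FileSize", adDouble
rs.Fields.Append "Modified", adDate
rs.Open

' Copy the properties of every file into the recordset.
For Each file In folder.Files
    rs.AddNew
    rs("FileName") = file.Name
    rs("FileSize") = file.Size
    rs("Modified") = file.DateLastModified
    rs.Update
Next

' Newest files first; files with the same timestamp ordered by name.
rs.Sort = "Modified DESC, FileName ASC"

If Not (rs.BOF And rs.EOF) Then rs.MoveFirst
Do Until rs.EOF
    WScript.Echo rs("Modified").Value & vbTab & _
        rs("FileSize").Value & vbTab & rs("FileName").Value
    rs.MoveNext
Loop
rs.Close
```

Changing the order only means assigning a different string to Sort (for example "FileSize DESC"); the recordset does not have to be rebuilt.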

mayayana

Aug 20, 2009, 11:04:21 AM
Here's a basic script that alphabetizes all file
names in C:\Windows. It's not terribly complex,
but it gets trickier if you want to sort
by date. This particular QuickSort is not case-sensitive.
(Otherwise you get all capitalized
names first.) For dates, as an offhand guess, I
think you'd want to convert the dates to numbers
and then use another version of QuickSort in
which you drop the UCase operation. (The sorting
uses the < comparison, so it can be adapted to both
words and numbers.)

I was actually just exploring sort routines a few
days ago. Interesting stuff. On my old Win98SE
machine with a Sempron 1,800 MHz CPU, a basic
QuickSort can sort close to 30,000 words per second
in a VBScript, and it doesn't seem to slow down with
increasing size. (The better-known bubble sort,
by comparison, is lucky to do 1,000 in a second and
quickly slows to the point of unusability as the array to
sort gets bigger.)

'--------------------------------------------------
' Sort all files in C:\Windows alphabetically
'--------------------------------------------------

Dim FSO, oFol, oFils, oFil, AFils(), i2
Set FSO = CreateObject("Scripting.FileSystemObject")
Set oFol = FSO.GetFolder("C:\Windows")
Set oFils = oFol.Files
ReDim AFils(oFils.Count - 1)
i2 = 0
' Copy the file names into a plain array so they can be sorted.
For Each oFil In oFils
    AFils(i2) = oFil.Name
    i2 = i2 + 1
Next
Set oFils = Nothing
Set oFol = Nothing
Set FSO = Nothing

' Passing 0 as the upper bound tells QuickSort to use UBound(AFils).
QuickSort AFils, 0, 0

Dim S2
S2 = Join(AFils, vbCrLf)

'-- watch out for this. An oversize message box
'-- sometimes puts the close button offscreen. :)
MsgBox S2


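Sub QuickSort(AIn, LBeg, LEnd)
    Dim LBeg2, vMid, LEnd2, vSwap
    ' An upper bound of 0 means "sort up to the end of the array".
    If (LEnd = 0) Then LEnd = UBound(AIn)
    LBeg2 = LBeg
    LEnd2 = LEnd
    ' Pivot: the middle element, upper-cased so the compare ignores case.
    vMid = UCase(AIn((LBeg + LEnd) \ 2))
    Do
        ' Skip elements that are already on the correct side of the pivot.
        Do While UCase(AIn(LBeg2)) < vMid And LBeg2 < LEnd
            LBeg2 = LBeg2 + 1
        Loop
        Do While vMid < UCase(AIn(LEnd2)) And LEnd2 > LBeg
            LEnd2 = LEnd2 - 1
        Loop
        ' Swap the two misplaced elements and move both markers inward.
        If LBeg2 <= LEnd2 Then
            vSwap = AIn(LBeg2)
            AIn(LBeg2) = AIn(LEnd2)
            AIn(LEnd2) = vSwap
            LBeg2 = LBeg2 + 1
            LEnd2 = LEnd2 - 1
        End If
    Loop Until LBeg2 > LEnd2
    ' Sort the two partitions recursively.
    If LBeg < LEnd2 Then QuickSort AIn, LBeg, LEnd2
    If LBeg2 < LEnd Then QuickSort AIn, LBeg2, LEnd
End Sub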
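
For sorting by date, the offhand idea above (convert the dates to numbers, drop UCase) might be sketched as follows. This is only an illustration: the name QuickSortByKey, the two parallel arrays and the use of CDbl on DateLastModified are assumptions, not something tested in this thread.

```vbscript
Option Explicit

Dim fso, oFils, oFil, aKeys(), aNames(), n, i

Set fso = CreateObject("Scripting.FileSystemObject")
Set oFils = fso.GetFolder("C:\Windows").Files   ' example folder (assumption)
n = oFils.Count

If n > 0 Then
    ReDim aKeys(n - 1)
    ReDim aNames(n - 1)
    i = 0
    For Each oFil In oFils
        aKeys(i) = CDbl(oFil.DateLastModified)   ' the date as a number
        aNames(i) = oFil.Name
        i = i + 1
    Next

    QuickSortByKey aKeys, aNames, 0, n - 1

    ' Oldest file first.
    For i = 0 To n - 1
        WScript.Echo CDate(aKeys(i)) & vbTab & aNames(i)
    Next
End If

' Sorts aKey ascending and applies every swap to aVal as well,
' so each name stays paired with its date.
Sub QuickSortByKey(aKey, aVal, ByVal lo, ByVal hi)
    Dim i, j, pivot, tmp
    i = lo
    j = hi
    pivot = aKey((lo + hi) \ 2)
    Do
        Do While aKey(i) < pivot
            i = i + 1
        Loop
        Do While pivot < aKey(j)
            j = j - 1
        Loop
        If i <= j Then
            tmp = aKey(i): aKey(i) = aKey(j): aKey(j) = tmp
            tmp = aVal(i): aVal(i) = aVal(j): aVal(j) = tmp
            i = i + 1
            j = j - 1
        End If
    Loop Until i > j
    If lo < j Then QuickSortByKey aKey, aVal, lo, j
    If i < hi Then QuickSortByKey aKey, aVal, i, hi
End Sub
```

Because the keys are plain numbers, the same routine would also work for file sizes; only the line that fills aKeys changes.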

Eric

Aug 20, 2009, 3:49:01 PM
If .NET is installed, use the .Sort method of a
CreateObject("System.Collections.ArrayList") object.
If not, do the ugly vbscript sort junk.
If you can get a directory listing to a file, use the directory options /OD
to sort.
If not, sorting by file date could get ugly.

mayayana

Aug 20, 2009, 4:09:43 PM
> If .NET is installed, use the .Sort method of a
> CreateObject("System.Collections.ArrayList") object.
> If not, do the ugly vbscript sort junk.

That's an interesting logic: Loading a 200MB+
dependency, that won't be on all machines, is
less "ugly" than 20-odd lines of VBScript "junk"? :)


ekkehard.horner

Aug 20, 2009, 4:40:27 PM
Eric wrote:

> If .NET is installed, use the .Sort method of a
> CreateObject("System.Collections.ArrayList") object.

AFAIK, you can use the ArrayList only with one-dimensional arrays (with
items of the same type). So sorting a file collection (essentially a
table) will be difficult, perhaps even ugly in the sense that you
have to do silly things - like putting the sizes right aligned in the
front of the strings or formatting the m/d/y dates to make them
sortable.
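
To make the "formatting the dates to make them sortable" workaround concrete, here is a rough sketch. It assumes the .NET Framework is installed so that System.Collections.ArrayList can be created from script; the helper names DateKey and Pad2 and the "|" separator are invented for this example.

```vbscript
Option Explicit

Dim fso, oFile, list, sItem

Set fso = CreateObject("Scripting.FileSystemObject")
Set list = CreateObject("System.Collections.ArrayList")   ' needs .NET

' One string per file: a fixed-width date key, a separator, the name.
' "|" cannot occur in a Windows file name, so it is a safe separator.
For Each oFile In fso.GetFolder("C:\Windows").Files
    list.Add DateKey(oFile.DateLastModified) & "|" & oFile.Name
Next

list.Sort

For Each sItem In list
    WScript.Echo sItem
Next

' Builds a yyyymmddhhnnss key so that string order equals date order.
Function DateKey(d)
    DateKey = Year(d) & Pad2(Month(d)) & Pad2(Day(d)) & _
              Pad2(Hour(d)) & Pad2(Minute(d)) & Pad2(Second(d))
End Function

' Left-pads a number to two digits.
Function Pad2(n)
    Pad2 = Right("0" & n, 2)
End Function
```

The price of this approach is visible in the code: every additional sort order needs its own key format, which is the "silly things" objection raised above.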

> If not, do the ugly vbscript sort junk.

Why do you say that? What is wrong with mayayana's quicksort?

> If you can get a directory listing to a file, use the directory options /OD
> to sort.
> If not, sorting by file date could get ugly.

Using dir will force you to shell out for each order you need.

The best method - in my opinion - would be the ADO approach.

[...]

Richard Mueller [MVP]

Aug 20, 2009, 6:32:27 PM

"ekkehard.horner" <ekkehar...@arcor.de> wrote in message
news:4a8db4bd$0$31331$9b4e...@newsspool4.arcor-online.net...

I would use the ADO disconnected recordset and the Sort method provided with
that object. I use it often. However, I don't know if it is faster than
mayayana's sort.

--
Richard Mueller
MVP Directory Services
Hilltop Lab - http://www.rlmueller.net
--


mayayana

Aug 20, 2009, 7:56:44 PM

> I would use the ADO disconnected recordset and the Sort method provided
> with that object. I use it often. However, I don't know if it is faster
> than mayayana's sort.

I'd be interested to know how that works,
and what its strengths are. I've never used
ADO before. Would you have code or links
you could post?

I never looked into sorting until recently.
When I went searching I came across this
page, with code by Ellis Dee, who seems
to be somewhat of a specialist in sorting:

http://www.vbforums.com/showthread.php?t=473677

He presents a number of methods for VB6,
which were easily adaptable to VBS. I tested
4 of the most promising, along with the bubble
sort method that's best known. I finally settled
on QuickSort as being the fastest for sorting
words alphabetically. And conveniently, it's also
one of the most compact methods. (I wanted to
avoid using a method that spans more than 1
function.) My 5 test scripts, along with explanation
and my results, are here:

www.jsware.net/jsware/scripts.php5#sorting

Frankly, I have a hard time grasping the logic
of why some of these methods work so well.
There is a remarkable amount of variation... and
there are a surprising number of methods available.

Richard Mueller [MVP]

Aug 20, 2009, 9:16:32 PM

"mayayana" <mayaX...@rcXXn.com> wrote in message
news:OTxXWFfI...@TK2MSFTNGP05.phx.gbl...

That's an interesting link. When I've coded my own I've always used a bubble
sort. I assume the Knuth shuffle refers to Donald Knuth, so I may have that
in his book.

I don't know what algorithm ADO uses, but I always assumed they took the
effort to find an efficient one. Even though I hate to require .NET, I found
a .NET function that was faster than other methods I tried by a factor of at
least 2 (and sometimes 10). This is pasted from a post by Tom Lavedas from
last Dec. 1:
================
thought I'd mention that there is a
sort object available to script if one is willing to make use of
the .Net framework, for example ...

' List object available to scripting from .Net framework
Set aDataList = CreateObject("System.Collections.ArrayList")

aDataList.Add "A"
aDataList.Add "E"
aDataList.Add "D"
aDataList.Add "F"
aDataList.Add "B"
aDataList.Add "C"

' Just for demonstration ...
s = ""
For Each sItem in aDataList
s = s & sItem & vbNewline
Next
wsh.echo "Before:" & vbnewline & s

' Sort
aDataList.Sort()
'aDataList.Reverse() ' the last shall be first

' Convert to an array
s = ""
For Each sItem in aDataList
s = s & sItem & vbNewline
Next
aList = Split(s, vbNewline)

' Just for demonstration ...
wsh.echo "After:" & vbnewline & s

I haven't tried, but I suspect it's faster than the fully scripted
version. Clearly it is simpler code
===========
A simple example using ADO and a disconnected recordset follows:
===============
Const adVarChar = 200
Const adSmallInt = 2
Const adDBTimeStamp = 135
Const MaxCharacters = 255

' Setup disconnected recordset.
Set objDataList = CreateObject("ADODB.Recordset")
objDataList.Fields.Append "Description", adVarChar, MaxCharacters
objDataList.Fields.Append "Number", adSmallInt
objDataList.Fields.Append "Index", adSmallInt
objDataList.Fields.Append "Date", adDBTimeStamp, MaxCharacters
objDataList.Open

' Create a few records.
objDataList.AddNew
objDataList("Description") = "Start"
objDataList("Number") = 4
objDataList("Index") = 3
objDataList("Date") = #8/15/2007#
objDataList.Update

objDataList.AddNew
objDataList("Description") = "Start"
objDataList("Number") = 4
objDataList("Index") = 2
objDataList("Date") = #8/18/2007#
objDataList.Update

objDataList.AddNew
objDataList("Description") = "Start"
objDataList("Number") = 4
objDataList("Index") = 1
objDataList("Date") = #8/17/2007#
objDataList.Update

objDataList.AddNew
objDataList("Description") = "Middle"
objDataList("Number") = 2
objDataList("Index") = 3
objDataList("Date") = #8/15/2007#
objDataList.Update

objDataList.AddNew
objDataList("Description") = "End"
objDataList("Number") = 3
objDataList("Index") = 3
objDataList("Date") = #8/15/2007#
objDataList.Update

objDataList.Sort = "Number,Index"

' Display sorted values.
objDataList.MoveFirst
Do Until objDataList.EOF
    Wscript.Echo objDataList.Fields.Item("Description") _
        & "," & objDataList.Fields.Item("Number") _
        & "," & objDataList.Fields.Item("Index") _
        & "," & objDataList.Fields.Item("Date")
    objDataList.MoveNext
Loop
objDataList.Close
==========
Generally you already have a recordset and you populate the disconnected
recordset in a loop, so you don't have so much code.

Paul Randall

Aug 20, 2009, 9:52:57 PM
Searching and sorting are closely related. A comprehensive book on the
subject was written back in the 1970's:
http://search.half.ebay.com/knuth-searching-sorting_W0QQmZbooks
There are a lot of other sources on sorting algorithms and their benefits
and shortcomings and formulas on how the time for sorting similar sets of
objects varies with the number of items being sorted. Some algorithms have
widely varying execution times for a given number of items, depending on how
they are initially ordered. Other algorithms, such as Shell sort, have
similar execution times no matter what the initial order is.

Quicksort is so quick because it splits the dataset into two subsets between
which is a value correctly positioned; it then recursively works on each of
the subsets, doing the smaller subset first to conserve stack space. Often
a different algorithm is used for a subset containing only a few items.

QBasic (I think that is what it was called), which came with later versions of
MS-DOS, included some fun examples with graphics and sound that demonstrated how
some common sorting algorithms work, as well as their relative speeds. They
were designed for 50-100 MHz computers; you would need a way to slow down
modern computers by a factor of 20 to 100 to be able to appreciate what you
see and hear.

-Paul Randall

mayayana

Aug 21, 2009, 12:20:02 AM
> Even though I hate to require .NET, I found
> a .NET function that was faster than other methods I tried by a factor
> of at least 2 (and sometimes 10).
....

> ' List object available to scripting from .Net framework
> Set aDataList = CreateObject("System.Collections.ArrayList")
>

I'd be interested in .Net if it were all scriptable
COM, and if the runtime were not so gigantic. But as
it is, like Java, .Net seems to have very little purpose
outside of corporate intranet applet writing. I don't
have either the JVM or the .Net runtime installed
myself. I've never had any reason to do so, so it
would just be adding extra bloat and security risks
for no reason. And I wouldn't assume that others have
it installed. ...Which is not even getting into the
gargantuan resource load and initial lag time that
System.Collections.ArrayList must involve. (I
assume the whole 200+ MB of the .Net runtime
gets loaded for that call; and .Net software is
notoriously slow to get started because of that.)

But I suppose if one is writing only to a known
target audience that's known to be running .Net
software, then there's no reason not to
use what's available.

----------

With the code below: I don't do much with databases
and have never used ADO. So I find it hard to make
sense of what you wrote. The idea here is that you
have to create a record with several fields for each
item? And create a database for those records? In my
own sort tests I've been dropping a text file onto a script,
which uses Split(filetext, " ") to create an array of words
to sort. How does such an array get put into records?
Would it be something like:

For i = 0 to UBound(a)
objDataList.AddNew
objDataList("Word") = a(i)
objDataList.Update
Next

OR

For each fil in oFils
objDataList.AddNew
objDataList("Word") = Fil.Name
objDataList.Update
Next

Does that make sense? I'm not clear
about which fields are required or what
"Number" and "Index" are for.

And what do you mean by, "generally
you already have a recordset"? I was assuming
that the point was that ADO sorting could be
used to sort something like an array efficiently,
without necessarily having a database. Can it
really be more efficient to create a database
to be sorted, rather than just sorting an array?
The QuickSort I posted is virtually instant up
to several thousand items. It only took about
2 1/2 seconds to sort 70,000+ words.

mayayana

Aug 21, 2009, 12:33:11 AM
> There are a lot of other sources on sorting algorithms and their benefits
> and shortcomings and formulas on how the time for sorting similar sets of
> objects varies with the number of items being sorted. Some algorithms have
> widely varying execution times for a given number of items, depending on
> how they are initially ordered. Other algorithms, such as Shell sort, have
> similar execution times no matter what the initial order is.
>

Yes. I didn't do very extensive testing myself but
I did notice that. Bubble sort, especially, seems to get
slower and slower as the number of items increases.
Though with the exception of bubble sort, the methods I
was testing don't seem to matter all that much until one
gets into the hundreds of thousands or millions
of items. If a sort routine can sort, say, 10,000 items
in 150 ms, it may take 300 ms on the second run and
220 ms on the third. VBScript is just too crude to get
high accuracy in tests on that scale. But even though
there's a 200% difference in the range of results, they're
all essentially instant for most purposes. It's unlikely
that I'll ever need to sort more than a few hundred items. :)

Richard Mueller [MVP]

Aug 22, 2009, 12:17:21 PM
mayayana wrote:

I use ADO to sort recordsets resulting from queries of Active Directory or
of SQL Server databases. You are correct that if the data is in an array,
you have the extra step of looping through the array and adding the values
to a disconnected recordset. Yes, this recordset is like a database, but it
is in memory and is much faster to work with than any database residing in a
file system. If you are sorting an array, I'm thinking the data came from
somewhere, and it might be possible to read the data into a disconnected
recordset in the first place instead of an array. I think it would be almost
as fast to populate a recordset as the array.

Following are two examples using disconnected ADO recordsets to sort. The
first converts an array of string values into a disconnected recordset to
sort. The second example retrieves all user names from Active Directory in
an ADO recordset, disconnects the recordset, and then sorts the values
before displaying. The example I posted earlier might have been confusing.
The "Number" and "Index" fields were just examples. My example below has
just one field.

This first example is one I used when I was comparing the performance of
several sort methods, including bubble sort (slowest), something called a
Benny sort (Benny Pedersen was active in the newsgroups), and using .NET to
sort (I have to admit the fastest method I found). This script shows that
the sort takes almost no time, but a fraction of a second is required to
set up the recordset and read the array values into it. It requires MDAC on
the client, but I believe any version will do:
==============
' ArraySort.vbs
' VBScript program to test several methods to sort.
Option Explicit

Dim arrRandom, adoDataList, strValue, intCount
Dim dtmT1, dtmT2, dtmT3

Const adVarChar = 200
Const MaxCharacters = 255

arrRandom = Array("Potato", "Lettuce", "Onion", "Bread", _
"Apple", "Orange", "Cherry", "Pear", "Milk", "Eggs", _
"Hamburger", "Ham", "Corn", "Beans", "Soup", "Pepper", _
"Salt", "Cornmeal", "Swiss Cheese", "Cheddar", "Basil", _
"Paprika", "Oregano", "Chili Powder", "Cottage Cheese", _
"French Fries", "Onion Rings", "Cola", "Waffles", "Peas", _
"Baked Beans", "Chili", "Pinto Beans", "Black Eyed Peas", _
"15 Bean Soup", "Cream", "Cream Cheese", "Yogurt", _
"Cake", "Ice Cream", "Cookies", "Butter", "Rye Bread", _
"Rice Cakes", "Whole Wheat Bread", "Butter Cookies", _
"Green Onion", "Celery", "Garlic Bread", "Pizza", _
"Tomato", "Grapes", "Black Beans", "Pancakes", "Red Beans", _
"Frozen Yogurt", "Tomato Paste", "Yeast", "Stewed Tomatoes", _
"French Toast", "Ginger Snaps", "Candy", "Chocolate", _
"Raisins", "Bay Leaves", "Rosemary", "Garlic Powder", _
"Thyme", "Macaroni", "Clam Chowder", "Split Pea")

dtmT1 = Timer()

' Setup disconnected recordset.
Set adoDataList = CreateObject("ADODB.Recordset")
adoDataList.Fields.Append "Value", adVarChar, MaxCharacters
adoDataList.Open

For Each strValue In arrRandom
    adoDataList.AddNew
    adoDataList("Value") = strValue
    adoDataList.Update
Next

dtmT2 = Timer()

adoDataList.Sort = "Value"

dtmT3 = Timer()

' Display sorted values.
intCount = 0
adoDataList.MoveFirst
Do Until adoDataList.EOF
    Wscript.Echo adoDataList.Fields.Item("Value")
    intCount = intCount + 1
    adoDataList.MoveNext
Loop
adoDataList.Close

Wscript.Echo "Number of values: " & CStr(intCount)
Wscript.Echo "ADO Sort setup: " & FormatNumber(dtmT2 - dtmT1, 4)
Wscript.Echo "ADO Sort : " & FormatNumber(dtmT3 - dtmT2, 4)
Wscript.Echo "ADO Sort total: " & FormatNumber(dtmT3 - dtmT1, 4)
==========
This second example shows how to sort all user names retrieved from Active
Directory. The trick is to specify the CursorLocation, CursorType, and
LockType to allow the recordset to be disconnected and sorted.
==========
Option Explicit

Dim objRootDSE, strDNSDomain, adoConnection
Dim strBase, strFilter, strAttributes, strQuery, adoRecordset
Dim strDN, intCount, strName

Const adOpenStatic = 3
Const adLockOptimistic = 3
Const adUseClient = 3

' Determine DNS domain name.
Set objRootDSE = GetObject("LDAP://RootDSE")
strDNSDomain = objRootDSE.Get("defaultNamingContext")

' Use ADO to search Active Directory.
Set adoConnection = CreateObject("ADODB.Connection")
adoConnection.Provider = "ADsDSOObject"
adoConnection.Open "Active Directory Provider"

Set adoRecordset = CreateObject("ADODB.Recordset")
adoRecordset.ActiveConnection = adoConnection
adoRecordset.CursorLocation = adUseClient
adoRecordset.CursorType = adOpenStatic
adoRecordset.LockType = adLockOptimistic

' Search entire domain.
strBase = "<LDAP://" & strDNSDomain & ">"

' Filter on all user objects.
strFilter = "(&(objectCategory=person)(objectClass=user))"

' Comma delimited list of attribute values to retrieve.
strAttributes = "distinguishedName,sAMAccountName"

' Construct the LDAP query.
strQuery = strBase & ";" & strFilter & ";" & strAttributes & ";subtree"

' Run the query.
adoRecordset.Source = strQuery
adoRecordset.Open

' Disconnect the recordset.
Set adoRecordset.ActiveConnection = Nothing
adoConnection.Close

' Sort the recordset.
adoRecordset.Sort = "sAMAccountName"
adoRecordset.MoveFirst

' Enumerate the resulting recordset.
intCount = 0
Do Until adoRecordset.EOF
    ' Retrieve values.
    strDN = adoRecordset.Fields("distinguishedName").Value
    strName = adoRecordset.Fields("sAMAccountName").Value
    Wscript.Echo strName & ": " & strDN
    intCount = intCount + 1
    adoRecordset.MoveNext
Loop

' Clean up.
adoRecordset.Close

Wscript.Echo "Number of objects: " & CStr(intCount)

mayayana

Aug 22, 2009, 1:30:18 PM
Thanks for that detailed explanation. I'll have to
explore this. ADO is all new to me.

Richard Mueller [MVP]

Aug 22, 2009, 1:48:30 PM

"mayayana" <mayaX...@rcXXn.com> wrote in message
news:ef4uv20I...@TK2MSFTNGP03.phx.gbl...

> Thanks for that detailed explanation. I'll have to
> explore this. ADO is all new to me.

I'll just add that there are GetRows and GetString methods of the
disconnected recordset object that can quickly convert into an array or
string. For example, adding to the first example I posted earlier:
========
' Convert recordset into an array.
adoDataList.MoveFirst
arrResults = adoDataList.GetRows

' Convert recordset into a semicolon delimited string.
Const adClipString = 2
adoDataList.MoveFirst
strResults = adoDataList.GetString(adClipString, , , ";", "<NULL>")
adoDataList.Close

' Enumerate the array.
intCount = 0
For Each strValue In arrResults
    Wscript.Echo strValue
    intCount = intCount + 1
Next
Wscript.Echo CStr(intCount)
' Display the string.
Wscript.Echo strResults
=====
I haven't found a quick way to convert an array into a recordset. But, you
can also apply filters to the disconnected recordset. For example:

adoDataList.MoveFirst
adoDataList.Filter = "Value > 'Fruit'"

And finally, you can sort descending with:

adoDataList.Sort = "Value DESC"
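
Put together in one place, filtering and descending sort might be combined as in this small sketch. The sample words and the variable names are made up; adFilterNone (0) is the ADO constant that removes a filter again.

```vbscript
Option Explicit

Const adVarChar = 200
Const adFilterNone = 0

Dim rs, sWord

' Small disconnected recordset with a single text field.
Set rs = CreateObject("ADODB.Recordset")
rs.Fields.Append "Value", adVarChar, 255
rs.Open

For Each sWord In Array("Pear", "Apple", "Milk", "Grapes", "Bread")
    rs.AddNew
    rs("Value") = sWord
    rs.Update
Next

' Sort descending, then show only the values that come after 'Fruit'.
rs.Sort = "Value DESC"
rs.Filter = "Value > 'Fruit'"

Do Until rs.EOF
    WScript.Echo rs("Value").Value
    rs.MoveNext
Loop

' Remove the filter; all records are visible again.
rs.Filter = adFilterNone
WScript.Echo "Records without filter: " & rs.RecordCount
rs.Close
```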

ekkehard.horner

Aug 22, 2009, 2:27:53 PM
Richard Mueller [MVP] wrote:
[...]

> I'll just add that there are GetRows and GetString methods of the
> disconnected recordset object that can quickly convert into an array or
> string. For example, adding to the first example I posted earlier:
> ========
> ' Convert recordset into an array.
> adoDataList.MoveFirst
> arrResults = adoDataList.GetRows

.GetRows() will return a two-dimensional array with as many 'rows'
(first dimension) as there are fields and as many 'columns' (second
dimension) as there are elements in the one-dimensional array you
put into the table.

> ' Convert recordset into a semicolon delimited string.
> Const adClipString = 2
> adoDataList.MoveFirst
> strResults = adoDataList.GetString(adClipString, , , ";", "<NULL>")
> adoDataList.Close

Three problems:
(1) you want an array (back) - so strResults has to be converted

(2) if you use Split on strResults, the choice of the delimiter
is critical: I would consider ";" as risky, because a value may
itself contain a semicolon

(3) After aData = Split( strResults, ";" ), aData will contain
one more element than the array you started with (an empty last
one), because GetString ends every row with the delimiter and
Split treats its second parameter as a separator.
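
One possible way around problems (2) and (3), given only as a sketch: use a row delimiter that is assumed never to occur in the data (vbLf here, which is an assumption about the values) and strip the trailing delimiter before calling Split. The sample values and variable names are invented.

```vbscript
Option Explicit

Const adVarChar = 200
Const adClipString = 2

Dim rs, sWord, sDelim, sAll, aData

Set rs = CreateObject("ADODB.Recordset")
rs.Fields.Append "Value", adVarChar, 255
rs.Open

' "b;c" contains a semicolon on purpose.
For Each sWord In Array("b;c", "a", "d")
    rs.AddNew
    rs("Value") = sWord
    rs.Update
Next
rs.Sort = "Value"

sDelim = vbLf   ' assumed never to occur inside a value
rs.MoveFirst
sAll = rs.GetString(adClipString, , , sDelim, "")
rs.Close

' GetString ends every row, including the last, with the row delimiter,
' so remove that final delimiter before splitting.
If Right(sAll, Len(sDelim)) = sDelim Then
    sAll = Left(sAll, Len(sAll) - Len(sDelim))
End If
aData = Split(sAll, sDelim)

WScript.Echo "Elements: " & (UBound(aData) + 1)
```

With ";" as the row delimiter the value "b;c" would have been cut in two, which is the risk named in (2).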

[...]

mayayana

Aug 22, 2009, 3:52:52 PM
I did some testing on this. It appears that
ADO has obvious advantages if one is dealing
with existing recordsets. It's fairly quick and
if I understand the Sort method correctly, it
seems to be sortable in more than 1 column,
so that names could be sorted within the same
date, for instance. (?)

For basic speed ADO turned out mediocre. It's
plenty fast for most uses, as are all sort methods
I tested except bubble sort. But it's not actually
very fast.

For my original sorting tests I use the following
code with each sort algorithm test script, to open
a dropped file and create an array:

Set FSO = CreateObject("Scripting.FileSystemObject")

Set TS = FSO.OpenTextFile(WScript.arguments(0), 1)
s1 = TS.ReadAll
TS.Close
Set TS = Nothing
A1 = Split(s1, " ")

I then call:

i = Timer
[SortRoutine Call here]
i2 = Timer

i2 - i is the measurement.

To test ADO I added the same array code
to the beginning of your sample. It turns out
that takes about 1/2 second to get an array.
It doesn't seem to matter much how big the
file is. I guess the FSO ops are probably most
of it and the Split is probably almost instant.

Starting with the array, I came up with the following:

_______________________
txt file of 12,277 words
------------------

QuickSort:         441 ms

ADO setup          773 ms
ADO Sort           609 ms
ADO Sort total   1,380 ms

________________
txt file of 73,705 words (ECMA 3 Reference 477 KB)
-------------------
QuickSort:       2,860 ms

ADO setup        4,179 ms
ADO Sort         5,269 ms
ADO Sort total   9,449 ms

This was just two files tested, running the test
a handful of times, but it shows a trend similar
to other sort methods. If only the actual sorting
time is counted, ADO gets slower as the number
of items gets bigger. (That seems to be true with
most or all methods, except QuickSort. Or at
least it's less marked with QuickSort.)

My earlier tests on the 477 KB file yielded the
following results:

quick   2,859 ms
shell   5,382 ms
snake  17,406 ms
merge  31,468 ms

So ADO was comparable to ShellSort, if only
the sorting is counted. And it's faster to use
ADO, including setup, than it is to use SnakeSort
and MergeSort. (At least with large numbers of
items.) But even just the actual sorting itself is
notably slower than QuickSort.

Another variable I wasn't able to check: I've been
using non-case-sensitive sorting. (Even though my
speed tests have been for basic case-sensitive sorting.)
I want an alphabetical list regardless of case. I don't
see where that option comes in with ADO. I looked up
the Sort method and found there seems to be only an
option for ASC or DESC to pick the sort direction. There
doesn't seem to be an option to choose the sort definition.

My test script, using your basic code, is below:

Dim FSO, TS, s1, A1, Arg, i, i2, s2, iAsc

Set FSO = CreateObject("Scripting.FileSystemObject")

Set TS = FSO.OpenTextFile(WScript.arguments(0), 1)
s1 = TS.ReadAll
TS.Close
Set TS = Nothing

A1 = Split(s1, " ")

Dim arrAscending, adoDataList, strValue, intCount
Dim dtmT1, dtmT2, dtmT3

Const adVarChar = 200
Const MaxCharacters = 255

dtmT1 = Timer()

' Setup disconnected recordset.
Set adoDataList = CreateObject("ADODB.Recordset")
adoDataList.Fields.Append "Value", adVarChar, MaxCharacters
adoDataList.Open

For Each strValue In A1 'arrRandom
    adoDataList.AddNew
    adoDataList("Value") = strValue
    adoDataList.Update
Next

dtmT2 = Timer()

adoDataList.Sort = "Value"

dtmT3 = Timer()

' Display sorted values.
intCount = 0
adoDataList.MoveFirst
Do Until adoDataList.EOF
    ' WScript.Echo adoDataList.Fields.Item("Value")
    intCount = intCount + 1
    adoDataList.MoveNext
Loop

adoDataList.Close
WScript.Echo "Number of values: " & CStr(intCount)
WScript.Echo "ADO Sort setup: " & FormatNumber(dtmT2 - dtmT1, 4)
WScript.Echo "ADO Sort : " & FormatNumber(dtmT3 - dtmT2, 4)
WScript.Echo "ADO Sort total: " & FormatNumber(dtmT3 - dtmT1, 4)


ekkehard.horner

Aug 22, 2009, 4:21:31 PM
mayayana wrote:

> I did some testing on this. It appears that
[...]

> Another variable I wasn't able to check: I've been
> using non-case-sensitive sorting. (Even though my
> speed tests have been for basic case-sensitive sorting.)
> I want an alphabetical list regardless of case. I don't
> see where that option comes in with ADO. I looked up
> the Sort method and found there seems to be only an
> option for ASC or DESC to pick the sort direction. There
> doesn't seem to be an option to choose the sort definition.

ADO sorts case insensitive (by default?). See:

===== Data: 10Strings
eins,Zwei,drei,vier,fünf,sechs,sieben,acht,neun,zehn
----- Func: ADOSimple
eins,Zwei,drei,vier,fünf,sechs,sieben,acht,neun,zehn
acht,drei,eins,fünf,neun,sechs,sieben,vier,zehn,Zwei
Falsch <== because my test function IsSorted is case sensitive, sigh

The challenge would be to get case sensitive sorting (maybe there
is a property of the recordset?)

[...]

> For Each strValue In A1 'arrRandom
> adoDataList.AddNew
> adoDataList("Value") = strValue
> adoDataList.Update
> Next

From my tests I assume that

aName = Array( "Value" )
aValue = Array( Empty )

For Each strValue In A1 'arrRandom
    aValue( 0 ) = strValue
    adoDataList.AddNew aName, aValue
Next

is faster.
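
As a self-contained sketch of that variant: AddNew accepts a field list and a matching list of values, and then no separate Update call is written. The sample words and variable names are made up, and the speed claim is not measured here.

```vbscript
Option Explicit

Const adVarChar = 200

Dim rs, aName, aValue, sWord

Set rs = CreateObject("ADODB.Recordset")
rs.Fields.Append "Value", adVarChar, 255
rs.Open

' One-element arrays: the field name and a slot for its value.
aName = Array("Value")
aValue = Array(Empty)

For Each sWord In Array("Pear", "Apple", "Milk")
    aValue(0) = sWord
    rs.AddNew aName, aValue   ' field list and values in one call
Next

rs.Sort = "Value"
rs.MoveFirst
Do Until rs.EOF
    WScript.Echo rs("Value").Value
    rs.MoveNext
Loop
rs.Close
```

With more fields, both arrays simply grow in step, for example Array("FileName", "Modified") together with a two-element value array.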

Dr J R Stockton

Aug 22, 2009, 1:30:21 PM
In microsoft.public.scripting.vbscript message
<OEis0fhIKHA.1020@TK2MSFTNGP02.phx.gbl>, Fri, 21 Aug 2009 00:33:11,
mayayana <mayaX...@rcXXn.com> posted:


Read <http://en.wikipedia.org/wiki/Sorting_algorithm> and its links.

--
(c) John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
Proper <= 4-line sig. separator as above, a line exactly "-- " (SonOfRFC1036)
Do not Mail News to me. Before a reply, quote with ">" or "> " (SonOfRFC1036)

mayayana

Aug 23, 2009, 2:11:29 PM

> ADO sorts case insensitive (by default?). See:

In that case the differences between methods
are not so marked. I get about 4.9 seconds on
the 477 KB file for non-case-sens. QuickSort.
So the ADO method is still a bit slower, even
if only the sorting itself is counted and the setup
cost is not included. But it's not slower by much.
It seems like the practical result is that QuickSort,
ShellSort and ADO are all quite adequate for almost
all needs, while BubbleSort should be discarded.

Though I should note that some of my results have also
varied a great deal. In general I keep getting
consistent results on different tests, using varied
sorting methods, but there have been occasional
instances when I've got an unexpected result. For
instance, sometimes I've dropped the 477 KB file
onto my QS or ADO test scripts and got a result
about 5 seconds longer than the other runs! I have
no idea why.


mayayana

Aug 23, 2009, 3:37:51 PM
OK... this may be more than anyone wants
to know at this point, but I've done some more tests. :)

The results are as follows:
----------------------------------------
sort test on 23 KB text file:

quick          156 ms
quick(NCS)     222 ms
shell          226 ms
snake          171 ms
merge          226 ms
ADO(NCS)       58/379 ms (actual sorting time / time to create ADO
               Recordset, transfer array, sort, and return
               sorted array.)
bubble         15,984 ms

---------------------------------
sort test on 477 KB plain text:

quick          2,859 ms
quick(NCS)     4,986 ms
shell          5,382 ms
ADO(NCS)       5,269/10,882 ms (actual sorting time / time to create ADO
               Recordset, transfer array, sort, and return
               sorted array.)
snake          17,406 ms
merge          31,468 ms

---------------------------------

I don't have the .Net runtime installed, so I
can't test that. NCS stands for non-case-sensitive.
I rewrote an ADO script to do the whole operation
of returning the sorted array. The ADO times
show both numbers: actual sorting time, applicable
for people working within ADO, and total time as
it would be for someone using the ADO method on
an array.
The ADO script follows. (I didn't change the AddNew
method as per ekkehard's spec. because I didn't
understand that code, but I'm guessing the difference
is probably negligible.)

'---- drop a text file onto this script ---------
Dim FSO, TS, s1, A1, Arg, i, s2, A2()
Dim ADO, sWord
Dim T1, T2, T3, T4

Set FSO = CreateObject("Scripting.FileSystemObject")
Set TS = FSO.OpenTextFile(WScript.arguments(0), 1)
s1 = TS.ReadAll
TS.Close
Set TS = Nothing

A1 = Split(s1, " ")

T1 = Timer
' Setup disconnected recordset.
Set ADO = CreateObject("ADODB.Recordset")
ADO.Fields.Append "Value", 200, 255
ADO.Open

For i = 0 to UBound(A1)
    ADO.AddNew
    ADO("Value") = A1(i)
    ADO.Update
Next

T2 = Timer

ADO.Sort = "Value"

T3 = Timer

'-- return the sorted array:
ADO.MoveFirst
A1 = ADO.GetRows
ReDim A2(UBound(A1, 2))
'-- GetRows is indexed (field, row); copy field 0 into a 1-D array.
For i = 0 to UBound(A1, 2)
    A2(i) = A1(0, i)
Next
A1 = A2
T4 = Timer
ADO.Close

MsgBox "ADO time required in milliseconds for " & CStr(UBound(A1) + 1) & _
    " values:" & vbCrLf & _
    "ADO setup: " & CStr((T2 - T1) * 1000) & vbCrLf & _
    "Actual Sorting: " & CStr((T3 - T2) * 1000) & vbCrLf & _
    "Combined setup plus sort: " & CStr((T3 - T1) * 1000) & vbCrLf & _
    "Total time to return original array sorted: " & CStr((T4 - T1) * 1000)

Set ADO = Nothing


Paul Randall

Aug 23, 2009, 3:51:03 PM
I'm guessing that you are measuring 'wall clock' time, not the time that the
processor is devoting to your sorting task. Perhaps you could disconnect
from the internet and use MSConfig to disable all unnecessary startup and
service stuff, and see if that gives you more consistent run times. If your
dataset is unchanging, then the sort times should be fairly consistent. If
the dataset is different each time, then your sort times can vary wildly.
Shell sort has a relatively small range of run times because it does
essentially the same number of comparisons for best- and worst-case
scenarios. The run time for any sorting algorithm depends on the total
number of comparisons that are made and the number of data moves that are
made. Quicksort slows down significantly in the worst-case scenario, where
the initial pivot point chosen for each iteration needs to be moved a
lot. For this reason, some quicksort algorithms use a randomization
mechanism to choose the initial pivot point.

You could easily build a large dataset that is correctly sorted, and
backward sorted and a same-sized random data set, and instrument your
VBScript algorithms to count and report the number of comparisons and data
exchanges that are made for the various data sets, as a quick way to
understand sorting times a little better. You may find that in some cases,
bubble sort can be faster than QuickSort or any of the other 'fast' sorts.

-Paul Randall
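
The instrumentation suggested above could look roughly like the following. This is only a sketch: the names BubbleSortCounted, MakeData, nCompare and nSwap are made up for illustration, and the bubble sort shown is one common variant with an early exit, not necessarily the one tested earlier in the thread.

```vbscript
Option Explicit

' Global counters (hypothetical names, not from the thread).
Dim nCompare, nSwap

' Bubble sort with an early exit, counting comparisons and exchanges.
Sub BubbleSortCounted(A)
  Dim i, j, tmp, swapped
  nCompare = 0 : nSwap = 0
  For i = UBound(A) - 1 To 0 Step -1
    swapped = False
    For j = 0 To i
      nCompare = nCompare + 1
      If A(j) > A(j + 1) Then
        tmp = A(j) : A(j) = A(j + 1) : A(j + 1) = tmp
        nSwap = nSwap + 1
        swapped = True
      End If
    Next
    If Not swapped Then Exit For   ' nothing moved: the array is sorted
  Next
End Sub

' Build a test array of n numbers: "sorted", "reversed", or random.
Function MakeData(n, kind)
  Dim A(), i
  ReDim A(n - 1)
  For i = 0 To n - 1
    Select Case kind
      Case "sorted"   : A(i) = i
      Case "reversed" : A(i) = n - i
      Case Else       : A(i) = Int(Rnd * n)
    End Select
  Next
  MakeData = A
End Function

Dim kind, data
Randomize
For Each kind In Array("sorted", "reversed", "random")
  data = MakeData(1000, kind)
  BubbleSortCounted data
  WScript.Echo kind & ": " & nCompare & " comparisons, " & nSwap & " exchanges"
Next
```

Run it with cscript so the output goes to the console. With the early exit, the already-sorted case should finish after a single pass, which is the kind of case where a bubble sort can beat the 'fast' sorts.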

"mayayana" <mayaX...@rcXXn.com> wrote in message

news:%23blFbyB...@TK2MSFTNGP03.phx.gbl...

mayayana

Aug 23, 2009, 5:48:36 PM

> I'm guessing that you are measuing 'wall clock' time, not time that the
> processor is devoting to your sorting task.

"Wall clock time"? I'm just calling:

i = timer
dosort
i2 = timer

But I don't know how accurate that is in script.
We're talking ms, after all, and the message has
to go through wscript, then probably to some
API call that may not be so perfect. Actually, in one
case where I got a 5 second diff. it turned out there
was another instance of wscript running. So maybe
that was the issue there. In general I've found a
fair amount of variation on smaller tests, with less
variation on larger tests, so I guessed that
VBS just wasn't accurate enough for the small
measurements.

> Perhaps you could disconnect
> from the internet and use MSConfig to disable
> all unnecessary startup and
> service stuff,

I beg your pardon. I'm running Win98SE, with
about 8 total processes and none of that
services crap, thank you very much. :)

Paul Randall

Aug 23, 2009, 8:37:34 PM

"mayayana" <mayaX...@rcXXn.com> wrote in message
news:%23%23EuvrDJ...@TK2MSFTNGP05.phx.gbl...

>
>
>> I'm guessing that you are measuing 'wall clock' time, not time that the
>> processor is devoting to your sorting task.
>
> "Wall clock time"? I'm just calling:
>
> i = timer
> dosort
> i2 = timer
>
> But I don't know how accurate that is in script.
> We're talking ms, after all, and the message has
> to go through wscript, then probably to some
> API call that may not be so perfect. Actually, in one
> case where I got a 5 second diff. it turned out there
> was another instance of wscript running. So maybe
> that was the issue there. In general I've found a
> fair amount of variation on smaller tests, with less
> variation on larger tests, so I guessed that
> VBS just wasn't accurate enough for the small
> measurements.

I'd say that your shortest times for each sort method/dataset combination
are the most accurate measure for comparing the efficiency of the various
sort methods. The longer times are almost always due to delays caused by
multitasking of your script with other tasks and the routine housekeeping
the computer is doing.
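
A rough sketch of the "take the shortest time" idea, assuming a placeholder DoSort routine (hypothetical; substitute the sort being measured and give it a fresh copy of the unsorted data on every run):

```vbscript
Option Explicit

' Hypothetical stand-in for whichever sort routine is being measured.
Sub DoSort()
  Dim i, x
  For i = 1 To 200000
    x = i * 2
  Next
End Sub

' Run DoSort several times and return the shortest elapsed time in ms.
Function BestTimeMs(runs)
  Dim k, t0, elapsed, best
  best = -1
  For k = 1 To runs
    t0 = Timer
    DoSort
    elapsed = (Timer - t0) * 1000
    ' Timer counts seconds since midnight, so correct for a wrap.
    If elapsed < 0 Then elapsed = elapsed + 86400000
    If best < 0 Or elapsed < best Then best = elapsed
  Next
  BestTimeMs = best
End Function

WScript.Echo "Best of 5 runs: " & Round(BestTimeMs(5)) & " ms"
```

Timer is still wall-clock time, so this does not remove interference from other tasks; it only makes the reported figure less sensitive to it.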

>> Perhaps you could disconnect
>> from the internet and use MSConfig to disable
>> all unnecessary startup and
>> service stuff,
>
> I beg your pardon. I'm running Win98SE, with
> about 8 total processes and none of that
> services crap, thank you very much. :)

I keep forgetting :-)

-Paul Randall


mr_unreliable

Sep 23, 2009, 7:49:11 AM
Vilius Mockūnas wrote:
> How do I sort this collection by name by date and etc.
> I can build sort logic of course but maybe there are easy standard ways ?
>

Several years ago, Mike Harris wrote a Shell Sort and a
quick sort in (vb)script. He posted the result here,
(vbs ng) and on Clarence Washington's website:

http://cwashington.netreach.net/

A while back, Clarence stopped maintaining his website,
thanks to the introduction of "a-little-bundle-of-joy"
into his life (which now consumes all his spare time).
And, when last seen, he was closing his site. However,
nothing ever disappears from the web, and MikHar's sort
routines are probably still out there in some archival
site or other.

cheers, jw

Reventlov

Sep 24, 2009, 4:32:29 PM
On Wed, 23 Sep 2009 07:49:11 -0400, mr_unreliable
<kindlyReply...@notmail.com> wrote:

On mayayana's site there is a zip file with several sorting routines:
www.jsware.net

A list of files can also be sorted by using the sort switches of the dir
command and redirecting the output, or by running the sort command on the
list (which sorts by name only).
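
A minimal sketch of the dir approach, assuming an NT-family Windows where %ComSpec% points to cmd.exe; the folder path is only an example:

```vbscript
Option Explicit

Dim sh, ex, folder, cmd
folder = "C:\Windows"   ' example folder (assumption)

Set sh = CreateObject("WScript.Shell")

' /b   = bare names only
' /a-d = files only, no directories
' /o:d = sort by date, oldest first (/o:n = by name, /o:-d = newest first)
cmd = sh.ExpandEnvironmentStrings("%ComSpec%") & _
      " /c dir /b /a-d /o:d """ & folder & """"

Set ex = sh.Exec(cmd)

' Read the sorted file names back one line at a time.
Do While Not ex.StdOut.AtEndOfStream
  WScript.Echo ex.StdOut.ReadLine
Loop
```

Reading the output through Exec avoids a temporary file. Redirecting to a file with > and reading it back via FileSystemObject would work as well.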

--
Giovanni Cenati (Bergamo, Italy)
Write to "Reventlov" at katamail com
http://digilander.libero.it/Cenati (Examples and programs in VBScript)
--

tonyb

Nov 1, 2009, 9:36:45 PM

I know this is a bit off thread, as this is just in normal Windows
usage, but I use the current ISO date/time string (yy;MM;ddthh;mm;ss) as
a unique identifier, pasting it before the file name, for any files I
want to keep in the order I filed them or received them (usually work
related).


--
tonyb

Dr J R Stockton

Nov 3, 2009, 12:34:53 PM
In microsoft.public.scripting.vbscript message <22f6f6d408818bb58f99a98c
2b24...@nntp-gateway.com>, Sun, 1 Nov 2009 20:36:45, tonyb
<gu...@unknown-email.com> posted:

I do not see where ISO 8601:2004 allows either a two-digit year or a
semicolon as separator. While the first two digits of the year will not
change soon, they do serve to indicate that the field order is not MDY
or DMY, and is probably YMD.

--
(c) John Stockton, nr London, UK. ?@merlyn.demon.co.uk Turnpike v6.05.
Web <URL:http://www.merlyn.demon.co.uk/> - w. FAQish topics, links, acronyms
PAS EXE etc : <URL:http://www.merlyn.demon.co.uk/programs/> - see 00index.htm
Dates - miscdate.htm estrdate.htm js-dates.htm pas-time.htm critdate.htm etc.

Al Dunbar

Nov 4, 2009, 12:05:48 AM

"Dr J R Stockton" <repl...@merlyn.demon.co.uk> wrote in message
news:Vyslo1P9...@invalid.uk.co.demon.merlyn.invalid...

> In microsoft.public.scripting.vbscript message <22f6f6d408818bb58f99a98c
> 2b24...@nntp-gateway.com>, Sun, 1 Nov 2009 20:36:45, tonyb
> <gu...@unknown-email.com> posted:
>>
>>I know this is a bit off thread, as this is just in normal Windows
>>usage, but I use the current ISO date/time string (yy;MM;ddthh;mm;ss) as
>>a unique identifier, pasting it before the file name, for any files I
>>want to keep in the order I filed them or received them (usually work
>>related).
>
> I do not see where ISO 8601:2004 allows either a two-digit year or a
> semicolon as separator. While the first two digits of the year will not
> change soon, they do serve to indicate that the field order is not MDY
> of DMY, and is probably YMD.

Dr. J., I am surprised, and even a bit shocked, to see you being almost an
apologist for a practice that has no particular justification ;-)

We relied on a similar trick in the previous century, during most of which
the two digit year would invariably be greater than the largest possible day
of month number. You may recall the name of the phenomenon that occurred
when it finally dawned on us that we were, effectively, planning our own
obsolescence. Kind of like my mother in law who back in the 1990's had a
tombstone sculpted for herself with the first two digits of the date of
death being pre-carved as "19". A common, and even practical, practice
earlier in the century. Apparently the few who were born in the 1890's and
who lived into the 21st century thought better of the idea and left the
slate completely blank as long as they could.

/Al

Dr J R Stockton

Nov 4, 2009, 6:32:43 PM
In microsoft.public.scripting.vbscript message <#F#E5xQXKHA.4688@TK2MSFT
NGP06.phx.gbl>, Tue, 3 Nov 2009 22:05:48, Al Dunbar
<alan...@hotmail.com> posted:

>
>"Dr J R Stockton" <repl...@merlyn.demon.co.uk> wrote in message
>news:Vyslo1P9...@invalid.uk.co.demon.merlyn.invalid...
>> In microsoft.public.scripting.vbscript message <22f6f6d408818bb58f99a98c
>> 2b24...@nntp-gateway.com>, Sun, 1 Nov 2009 20:36:45, tonyb
>> <gu...@unknown-email.com> posted:
>>>
>>>I know this is a bit off thread, as this is just in normal Windows
>>>usage, but I use the current ISO date/time string (yy;MM;ddthh;mm;ss) as
>>>a unique identifier, pasting it before the file name, for any files I
>>>want to keep in the order I filed them or received them (usually work
>>>related).
>>
>> I do not see where ISO 8601:2004 allows either a two-digit year or a
>> semicolon as separator. While the first two digits of the year will not
>> change soon, they do serve to indicate that the field order is not MDY
>> of DMY, and is probably YMD.
>
>Dr. J., I am surprised, and even a bit shocked, to see you being almost
>an apologist for a practice that has no particular justification ;-)


Attribute it to a lack of understanding on your own part.

tonyb

Nov 6, 2009, 11:12:59 AM

I must accept your castigation for not having done my homework properly.
The year in ISO format should indeed have 4 digits. Unfortunately,
although the correct separator is a colon, Windows does not allow this
in file names!


--
tonyb

Dr J R Stockton

Nov 7, 2009, 1:14:07 PM
In microsoft.public.scripting.vbscript message <a770c07dcdb529f6d921d8db
0eb9...@nntp-gateway.com>, Fri, 6 Nov 2009 10:12:59, tonyb
<gu...@unknown-email.com> posted:

The correct time separator is a colon.

You can use - between the date fields and _ between the time fields.

Probably better to use the compact ISO date and time forms YYYYMMDD and
hhmmss, and to put a minus between them, e.g. 20091107-181247 followed by
the rest of the file name.
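
For anyone wanting to generate such a prefix from script, here is a small sketch. Pad2 and IsoStamp are hypothetical helper names and "_report.txt" is just an example file name.

```vbscript
Option Explicit

' Left-pad a number to two digits, e.g. 7 becomes "07".
Function Pad2(n)
  Pad2 = Right("0" & n, 2)
End Function

' Build a sortable YYYYMMDD-hhmmss stamp from a date value,
' e.g. 20091107-181247 for 7 Nov 2009 18:12:47.
Function IsoStamp(dt)
  IsoStamp = Year(dt) & Pad2(Month(dt)) & Pad2(Day(dt)) & "-" & _
             Pad2(Hour(dt)) & Pad2(Minute(dt)) & Pad2(Second(dt))
End Function

WScript.Echo IsoStamp(Now) & "_report.txt"
```

Because every field is fixed-width and runs from most to least significant, a plain name sort of such files is also a chronological sort.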

Al Dunbar

Nov 9, 2009, 1:34:21 AM

"Dr J R Stockton" <repl...@merlyn.demon.co.uk> wrote in message
news:28VN7Tsb...@invalid.uk.co.demon.merlyn.invalid...

> In microsoft.public.scripting.vbscript message <#F#E5xQXKHA.4688@TK2MSFT
> NGP06.phx.gbl>, Tue, 3 Nov 2009 22:05:48, Al Dunbar
> <alan...@hotmail.com> posted:
>>
>>"Dr J R Stockton" <repl...@merlyn.demon.co.uk> wrote in message
>>news:Vyslo1P9...@invalid.uk.co.demon.merlyn.invalid...
>>> In microsoft.public.scripting.vbscript message <22f6f6d408818bb58f99a98c
>>> 2b24...@nntp-gateway.com>, Sun, 1 Nov 2009 20:36:45, tonyb
>>> <gu...@unknown-email.com> posted:
>>>>
>>>>I know this is a bit off thread, as this is just in normal Windows
>>>>usage, but I use the current ISO date/time string (yy;MM;ddthh;mm;ss) as
>>>>a unique identifier, pasting it before the file name, for any files I
>>>>want to keep in the order I filed them or received them (usually work
>>>>related).
>>>
>>> I do not see where ISO 8601:2004 allows either a two-digit year or a
>>> semicolon as separator. While the first two digits of the year will not
>>> change soon, they do serve to indicate that the field order is not MDY
>>> of DMY, and is probably YMD.
>>
>>Dr. J., I am surprised, and even a bit shocked, to see you being almost
>>an apologist for a practice that has no particular justification ;-)
>
>
> Attribute it to a lack of understanding on your own part.

My apologies; however, it was more a lack of interpretation on my part than
understanding. To me it's not the value of the digits in the year that
indicates the YMD order, but the fact that there are more than two of them,
plus the fact that YDM is the least likely order possible.

/Al
