Performance issues with #() and very large lists?


Cristos Lianides-Chin

Mar 18, 2015, 5:17:12 PM
to fmsta...@googlegroups.com
I'm using the standard Let-notation function "#" to try and create a dictionary from a very large list of approx. 500,000 primary key IDs. However, it is taking almost a minute to process.

Set Variable [ $dictionary = #("pkIdList" ; $listOfPkIds) ]

When I build the dictionary manually using Quote(), it is done in a few seconds. If I pass the already-quoted list (of ~500,000 IDs) to #() instead, it takes just about as long as before.
Set Variable [ $quotedPkIds = Quote ( $listOfPkIds ) ] // this takes a couple of seconds
Set Variable [ $manualDictionary = "$pkIdList = " & $quotedPkIds & ";¶" ] // this is instant
Set Variable [ $dictionary = #("pkIdList" ; $quotedPkIds) ] // this still takes almost a minute

I'm not sure what part of the #() function is causing the slowdown; I'm guessing it's something that secretly operates recursively under the hood, but I still don't fully grok the internals of #(). Any suggestions on what I should look at or what can be optimized?

Here's the internals of #() for reference. Thanks!

/**
 * =====================================
 * # ( name ; value )
 *
 * RETURNS:
 *        A name-value pair in Let notation.
 *
 * PARAMETERS:
 *        name: The name for the returned name-value pair. name can be any value
 *        that would be a valid Let() variable name.
 *        value: The value for the returned name-value pair.
 *
 * EXAMPLE:
 *        # ( "name"; $value ) & # ( "foo" ; "bar" )
 *
 * DEPENDENCIES: none
 *
 * REFERENCES:
 *        https://github.com/filemakerstandards/fmpstandards/blob/master/Functions/%23Name-Value/%23.fmfn
 * =====================================
 */


Let ( [
    ~name =    // strip leading "$$" and "$"
        Substitute (
            "/*start*/" & name ;
            [ "/*start*/$$" ; "" ] ;
            [ "/*start*/$" ; "" ] ;
            [ "/*start*/" ; "" ]
        ) ;
    ~plusOneText = GetAsText ( value + 1 ) ;
    ~isValidDate = not EvaluationError ( GetAsDate ( value ) ) ;
    ~isValidTime = not EvaluationError ( GetAsTime ( value ) ) ;
    ~number = GetAsNumber ( value ) ;
    ~value =
        Case (
            value = "" or value = "?" or ~number = "?" ;
                Quote ( value ) ;

            ~isValidDate
            and ~isValidTime
            and GetAsText ( GetAsTimestamp ( value ) + 1 ) = ~plusOneText ;
                "GetAsTimestamp ( " & Quote ( value ) & " )" ;

            ~isValidTime
            and GetAsText ( GetAsTime ( value ) + 1 ) = ~plusOneText ;
                "GetAsTime ( " & Quote ( value ) & " )" ;

            ~isValidDate
            and GetAsText ( GetAsDate ( value ) + 1 ) = ~plusOneText ;
                "GetAsDate ( " & Quote ( value ) & " )" ;

            value ≠ ~number ;
                Quote ( value ) ;

            /* Else */
                ~number
        ) ;
    ~result =
        "$"
        & ~name
        & " = "
        & ~value
        & " ;¶" ;
    ~testExpression =
        "Let ( [ "
        & ~result
        & " ~ = \"\" ]; \"\" )" ;
    ~error =
        Case (
            IsEmpty ( ~name ) or Position ( ~name ; ¶ ; 1 ; 1 ) ≠ 0 ;
                11 ;    // Name is not valid

            not IsValidExpression ( ~testExpression ) ;
                1200    // Generic calculation error
        )
];
    If ( ~error ;    // prevent bad pairs from affecting evaluation by commenting
        "/* Error "
        & ~error
        & " name: "
        & Quote (
            Substitute (    // escape comment character sequences
                name ;
                [ "*/" ; "\*\/" ] ;
                [ "/*" ; "\/\*" ]
            )
        )
        & " value: "
        & Quote (
            Substitute (    // escape comment character sequences
                value ;
                [ "*/" ; "\*\/" ] ;
                [ "/*" ; "\/\*" ]
            )
        )
        & " */"
        & ¶ ;

    /* Else */
        ~result
    )
)


Jeremy Bante

Mar 18, 2015, 5:25:08 PM
to fmsta...@googlegroups.com
The # ( name ; value ) function has grown A LOT from its simpler earlier forms, mostly to infer the data type of the value. It does not surprise me that it starts to choke on larger data sets. A more performance-sensitive approach might be to have separate versions of the function for each type, which the developer selects explicitly, rather than relying on the function to detect the type for us. Performance has always been one of the albatrosses around the neck of dynamic typing. So the version for text might be just:

// #Text ( name ; value )
"$" & name & " = " & Quote ( value ) & " ;¶"

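By the same pattern, the other types might be sketched as below. The function names #Date and #Timestamp are hypothetical (they are not part of the published standard), and the wrappers simply mirror the GetAsDate / GetAsTimestamp branches of the existing # function. Like the text version, they assume name is already a valid variable name without a leading "$", since they skip the name cleanup and the IsValidExpression check.

    // #Date ( name ; value )        -- hypothetical name
    "$" & name & " = GetAsDate ( " & Quote ( value ) & " ) ;¶"

    // #Timestamp ( name ; value )   -- hypothetical name
    "$" & name & " = GetAsTimestamp ( " & Quote ( value ) & " ) ;¶"
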
Daniel Smith

Mar 18, 2015, 6:20:13 PM
to filemakerstandards.org
I've done extensive speed testing of the # function recently because I am using it in fmQBO to encode JSON values. JSON ended up being a bit of a special use case for two reasons:
  1. It can often be large (i.e., lots of data).
  2. It has many characters that need to be escaped, like quotes, and returns if the JSON is pretty-printed.
The single largest factor in performance ended up being the use of the IsValidExpression function, which takes longer to run when more characters have been escaped by the Quote function. I've been too busy getting fmQBO ready to bother releasing my findings, but here is a copy of the # function that I'm using in fmQBO: https://gist.github.com/dansmith65/678e137357ba8499c249

However, like Jeremy said, dynamic typing is always going to have more overhead than static typing. So, whenever I need to encode a JSON value that could potentially be very large (like all Invoices), I manually encode it (i.e., static typing):

    "$name = " & Quote ( $value ) & " ;¶"

So far, I've found that dates, times, timestamps, and numbers are all dynamically typed pretty quickly, so I didn't bother to statically type most of them.
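
As a rough sketch of how a manually encoded pair can sit alongside pairs built with #, assuming hypothetical variables $customerId and $invoicesJson: the two kinds of pairs concatenate into one dictionary, and the whole thing can be read back by wrapping it in a Let, in the same shape as the ~testExpression inside #.

    Set Variable [ $dictionary = # ( "customerId" ; $customerId ) & "$invoices = " & Quote ( $invoicesJson ) & " ;¶" ]
    Set Variable [ $ignore = Evaluate ( "Let ( [ " & $dictionary & " ~ = \"\" ] ; \"\" )" ) ] // assigns $customerId and $invoices

Only the large value bypasses type detection and validation here; the small value still goes through #.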


Cristos Lianides-Chin

Mar 20, 2015, 12:04:36 AM
to fmsta...@googlegroups.com
Thanks, Dan -- I used the version from your gist and it's working great. I'm curious if you've encountered performance problems with any of the other standard #Parameters functions?

Daniel Smith

Mar 20, 2015, 12:55:28 AM
to fmsta...@googlegroups.com
Similar changes could also be applied to #List. None of the other custom functions can be sped up so drastically using the same method.

Cristos Lianides-Chin

Mar 23, 2015, 12:06:49 PM
to fmsta...@googlegroups.com
I found that VerifyVariablesNotEmpty was causing a slowdown with this as its core code:
        IsValidExpression ( ~testExpression )
        and Evaluate ( ~testExpression )

but it runs faster when I take out the IsValidExpression() check and make it just:
        Evaluate ( ~testExpression )

I'm using the modified version in a very limited scenario, so this might not be safe to do everywhere, but I thought it would be useful to collect the information here.
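
One way the two calls could be timed separately, as a sketch only: it assumes FileMaker 13 or later for Get ( CurrentTimeUTCMilliseconds ), the variable names are illustrative, and $testExpression imitates the shape of ~testExpression in # rather than the actual VerifyVariablesNotEmpty code.

    Set Variable [ $testExpression = "Let ( [ $pkIdList = " & Quote ( $listOfPkIds ) & " ; ~ = \"\" ] ; \"\" )" ]
    Set Variable [ $start = Get ( CurrentTimeUTCMilliseconds ) ]
    Set Variable [ $isValid = IsValidExpression ( $testExpression ) ]
    Set Variable [ $validateMs = Get ( CurrentTimeUTCMilliseconds ) - $start ]
    Set Variable [ $start = Get ( CurrentTimeUTCMilliseconds ) ]
    Set Variable [ $evaluated = Evaluate ( $testExpression ) ] // side effect: assigns $pkIdList
    Set Variable [ $evaluateMs = Get ( CurrentTimeUTCMilliseconds ) - $start ]
    Show Custom Dialog [ "Timing" ; "IsValidExpression: " & $validateMs & " ms¶Evaluate: " & $evaluateMs & " ms" ]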

~ Cristos

Daniel Smith

Mar 23, 2015, 1:23:14 PM
to filemakerstandards.org
Thanks for catching that one; I always forget about it since it isn't prefixed with #.

I made a quick revision to that function that should address the speed issue, but didn't test it much, yet: https://gist.github.com/dansmith65/b193236ed35f8d7b40da

I'm actually surprised that this function would ever perform that slowly since it's dealing with variables instead of quoted strings, like the other functions. Did you do any speed tests on it?  If so, do you mind comparing this new version and reporting your findings?

Cristos L-C

Mar 23, 2015, 1:31:15 PM
to fmsta...@googlegroups.com
I haven't done any speed profiling yet...I'll see if I can do that this coming weekend.