Maximum number of cores


Susana Gonzalez

Jan 20, 2017, 2:20:21 AM
to last...@googlegroups.com

Hi all,

What is the maximum number of cores that I can use in LAStools?

I have a batch of data where each tile has around 65,000,000 points (a sample lasinfo text report is attached); each tile is around 360,000 KB on disk.

I am using a computer with 40 logical processors and 64 GB of installed memory.


So the question is: how do I know when the software or computer is running out of memory? How many cores can I run for these big tiles on this computer?


I have this error as well, what does this mean?


ERROR: nrows= 20 and ncols=0 not supported by SRbufferInMemory


Thanks

Susana





Terje Mathisen

Jan 20, 2017, 2:43:31 AM
to last...@googlegroups.com
Susana Gonzalez wrote:
>
> Hi all,
>
> What is the maximum number of cores that I can use in LAStools?
>
> I have a batch of data where each tile is around 65,000,000 points
> (attached an example of a txt file from lasinfo), the size of each
> tile is around 360,000 KB.
>
> I am using a computer with 40 logical processors and 64 GB of
> installed memory.
>

I looked at your lasinfo results: this is 65M points in a 1000 x 1000 m
block, i.e. 65 points per square meter, which is quite extraordinary. Is
this the result of a terrestrial scan, or of dense photo matching?

You have obviously licensed LAStools (good for you and good for Martin!)
so the number of points is OK, but if you have multiple such tiles
then you almost certainly want to retile them first with a small buffer
around each tile, since that avoids most boundary artifacts.
>
>
> So the question is, how do I know when the software or computer
> is running out of memory? How many cores could I run for these big
> tiles in the computer?
>

All of the LAStools binaries are compiled in 32-bit mode, so they can
probably never use more than 2-3 GB of memory effectively. If you run
out of RAM processing a single tile then you need to retile with a
smaller block size; I'd suggest something like 500 x 500 m with a 10 m buffer.
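A minimal sketch of such a retiling step (the input and output folder names below are placeholders, not paths from this thread):

  lastile -i raw\*.laz -tile_size 500 -buffer 10 -odir tiles_500m -olaz

The '-buffer 10' keeps 10 m of neighboring points around each 500 x 500 m tile so that later processing steps do not show edge artifacts; '-olaz' writes compressed output.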
>
>
> I have this error as well, what does this mean?
>
>
> ERROR: nrows= 20 and ncols=0 not supported by SRbufferInMemory
>

ncols = 0 means that some part of the code has tried to create a
rectangle with zero width, possibly a rounding error somewhere that
goes in two different directions. I did note that the maximum
coordinates are exactly 1000.000 m higher than the minimums; you would
normally expect the maximum to be 999.99 or 999.999 if the mm resolution
is valid. If this is the problem then retiling will solve it as well.

Terje


--
- <Terje.M...@tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Piotr

Jan 20, 2017, 8:35:47 PM
to LAStools - efficient tools for LiDAR processing, Susana....@interpine.co.nz
Wouldn't the limiting factor here be the disk input/output?
What I usually do is watch the disk usage while running LAStools on several cores. I often read the data from one HDD and write the results to a second one; I watch the usage of both and set the number of cores so that both stay under 100%.
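For example, one way to watch the disk counters from the command line while a batch is running (Resource Monitor or Task Manager works just as well; the 5-second sample interval is arbitrary):

  typeperf "\PhysicalDisk(_Total)\% Disk Time" -si 5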

Best,
Piotr

Albert Godfrind

Jan 21, 2017, 10:45:20 PM
to last...@googlegroups.com, Susana....@interpine.co.nz
There can be many limiting factors. Disk I/O is one of them, as are the interconnect with storage and the interconnect between cluster nodes (once you go that way). This is why it is important that a hardware platform is balanced in such a way that it can withstand the load; otherwise you are limited by the slowest and weakest component.

Among those factors is the way storage is organized: in modern systems that means striping and mirroring and the use of logical volume abstractions. There is also contention in the file system between concurrent threads reading and writing the same files.

Ultimately, heavy-duty computing on large data sets ("big data", anyone?) nowadays tends to happen on clustered architectures like Hadoop/HDFS or Spark (full in-memory processing), or on NoSQL databases.

I am actually surprised not to see more lidar processing being shifted to big data platforms.

Albert
--
Albert Godfrind
+33 6 09 97 27 23
Sent from my iPhone

Susana Gonzalez

Jan 24, 2017, 2:42:45 AM
to last...@googlegroups.com

Hi,

The data is an airborne LiDAR dataset and I have retiled it using 500 x 500 m tiles with a 50 m buffer.

I read the data from an input SSD and save the output to another SSD (each of them is 1 TB), both installed in the same computer.

I have followed Piotr's suggestion of watching the CPU and disk usage, and that answered my question: even though my computer has 40 cores, to run this large dataset I should use around 35 cores for LAStools (maybe 30, so I can still work on other things).

Thanks for the help.

Susana

 

Susana Gonzalez       Interpine Group Ltd

DDI:  +64 7 350 3209 or Australia 0280113645 ext 722

 

Interpine Innovation is Shaping Today’s Forests with the Technology of Tomorrow

He rangahau tenei ra he hangarau apopo

Martin Isenburg

Jan 31, 2017, 5:28:02 AM
to LAStools - efficient command line tools for LIDAR processing
Hello,

Just to summarize it from my perspective.

What is the maximum number of cores that I can use in LAStools?

I have a batch of data where each tile has around 65,000,000 points (a sample lasinfo text report is attached); each tile is around 360,000 KB on disk.

I am using a computer with 40 logical processors and 64 GB of installed memory.

I think you should be able to run LAStools with the '-cores 40' option, but you need to make sure that each tile has no more than 20 million points in case you run operations that require triangulating the input in-core (e.g. lasheight, lasground, las2dem, las2iso, ...). Note that each of the BLAST extensions uses 3 cores already, so limit your number accordingly.
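As a rough sketch of what such a multi-core run could look like (the folder names are placeholders, not paths from this thread; 35 cores simply leaves a few of the 40 logical processors free, as Susana mentioned above):

  lasground -i tiles\*.laz -odir tiles_ground -olaz -cores 35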

However, as mentioned before, most likely it will be your disk that limits the processing so that more processors will not help when they are fighting for bandwidth. How to optimize disk use?

(1) lower the amount of data that is read and written: for points use LAZ, not LAS; for rasters use IMG/TIF/LAZ, not ASC/BIL
(2) read from one disk and write to another disk; for a longer workflow, exchange the roles of the two disks with each batch command (a sketch follows below). This can be a really big win for spinning hard drives. Not sure what difference it makes for SSDs.
(3) consider using LASlayers (*) to cut down on the amount of output written to disk during lasnoise, lasground, lasheight, lasclassify, ...
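A minimal two-disk sketch of points (1) and (2) (the drive letters D: and E: and the folder names are assumptions for illustration only):

  rem step 1: read compressed tiles from drive D:, write denoised LAZ to drive E:
  lasnoise -i D:\tiles\*.laz -odir E:\denoised -olaz -cores 35
  rem step 2: read from E:, write TIF rasters back to D:, so the two disks swap roles
  las2dem -i E:\denoised\*.laz -step 1 -odir D:\dem -otif -cores 35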

So the question is: how do I know when the software or computer is running out of memory? How many cores can I run for these big tiles on this computer?

LAStools are compiled as 32-bit. Each individual process will run out of memory when it reaches 2 GB. Limiting tiles to 20 million points (or fewer) avoids that (and has the potential to make processing more efficient, as smaller tiles mean the LiDAR points are spatially closer to one another and less time is spent considering distant points during some of the computations and traversals).

I have this error as well, what does this mean?

ERROR: nrows= 20 and ncols=0 not supported by SRbufferInMemory

Never seen this before. I would need a sample data set that generates this.

Regards,

Martin @rapidlasso

Susana Gonzalez

Jan 8, 2019, 11:45:54 PM
to last...@googlegroups.com

Hi Martin,

Using LAStools compiled as 64-bit, when will the software run out of memory?

What is the core limitation for the 64-bit compiled version? Is it again 20 million points per tile?

What computer requirements would I need to run -cores 100 "safely"? If such a computer exists, what are the hardware requirements?

Thanks

Susana


Martin Isenburg

Jan 9, 2019, 12:55:53 AM
to LAStools - efficient command line tools for LIDAR processing
Hello,

a 64-bit executable can use *a lot* of memory. The limit will be the physical memory you actually have, before swapping / thrashing sets in. I would still suggest keeping tiles reasonably small, as many of the algorithms get slower (i.e. O(n log n) run-time) as the number of input points increases. However, some algorithms can really shine with 64-bit memory use, such as lasduplicate, lasboundary, lasthin, lasnoise, lascanopy, lastile, lasoverage, lasgrid, ... that keep a lot of mainly dormant temporary data structures around.

If you can continue with 10 - 20 million points per tile, I suggest doing so, but you may also go up to 30 or 40 million per tile. I have not done performance studies, but I expect that at 100 million points or higher some algorithms will degrade in performance significantly. Any algorithm that builds a temporary TIN and performs point location on that TIN, such as lasground, lasheight, las2tin, las2dem, las2iso, lasthin (with the '-adaptive 0.1' option), ..., is likely to slow down when input tiles have that many points.

How many cores? Never more than you have. But also never more processes than you have physical memory to accommodate at the same time. Just watch the memory consumption in the Task Manager ...

Regards,

Martin

Evon Silvia

Jan 9, 2019, 6:22:33 PM
to last...@googlegroups.com
As a tip from someone who has done some hack-job benchmarking of my own 64-bit lidar applications: do keep in mind the difference between a hyper-threaded (HT) core and a true core, as in my experience the gain from heavy-duty processing on HT cores on top of the real cores is almost negligible.

For example, the Intel i7 processor has 8 logical (HT) cores but only 4 physical cores. Thus you'll see a significant gain going from 1 to 4 cores (unless you run into thrashing - search this forum for discussion on that), but minimal improvement going from 4 to 8 cores.

In a nutshell, HT cores exist to make your computer think there are more cores than there really are. They shine when some processes are idle some of the time (e.g., video games, graphics-heavy work like CAD or videography), but not so much for CPU-heavy production work (e.g., lidar processing). So if you're not seeing the gains you expected from adding more cores, research your CPU to see how many physical cores you have and experiment with that limit instead (one quick way to check is shown below).
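On Windows, one way to check the physical vs. logical core count from the command line (just a convenience; your CPU's spec sheet gives the same information):

  wmic cpu get NumberOfCores,NumberOfLogicalProcessors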

Evon
--
Evon Silvia PLS
QSI Solutions Architect
ASPRS LAS Working Group Chair
1100 NE Circle Blvd, Suite 126, Corvallis, OR 97330
P: (541) 249-5818

Floris Groesz

Jan 10, 2019, 4:21:15 AM
to LAStools - efficient tools for LiDAR processing
There is no single answer to what the limiting factor of your computer will be for running LAStools; it depends on which tool you run.
Triangulating tools require a lot of memory, while for a simple classification the I/O is usually the first bottleneck.

We run several tools on a server with 32 cores and 64 GB RAM. This server has much better I/O than our normal workstations. We usually watch what the tool is doing and look at the resource monitor at the same time; you will quickly see what the bottleneck is for most processes.
Using buffers on the fly and having many tiles in one folder is another typical cause of slowdown, even if the tiles have been indexed.

The balance between tile sizes and buffers is also complicated: if you choose a tile size that is too small compared to the buffer you need, you increase the number of points to be processed unnecessarily (for example, assuming roughly uniform density, a 500 m tile with a 50 m buffer reads about (600/500)^2 ≈ 1.4 times the points of the tile itself, while a 250 m tile with the same buffer reads about (350/250)^2 ≈ 2 times).
Here you have to trust Martin's tips and your own experience; properly testing all the variables would take a lot of time.

The most important thing is to decide how many points you actually need. We always make the mistake of using all the points we have.
Thinning a little bit can make a huge difference (see the sketch below).
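A sketch of such a thinning step (the 1 m grid and the folder names are placeholders; the right spacing depends on the product you need):

  lasthin -i tiles\*.laz -step 1 -lowest -odir tiles_thinned -olaz -cores 16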

Floris

Susana Gonzalez

Jan 10, 2019, 6:37:19 PM
to last...@googlegroups.com

Hi again,

 

Thanks for all your comments.

 

Last year I processed 560,000 ha of LiDAR data. My current computer is a powerful one, but it looks like for future projects we will need to start looking at solutions in the cloud, since we need to process bigger areas and I will need more than the 40 cores I currently have.

Does anyone have experience processing LiDAR data on a web service?

Nicolas Cadieux

Jan 10, 2019, 9:44:44 PM
to last...@googlegroups.com
Hi,

I did consider that, until I calculated the time lost uploading the data to the cloud. I opted for M.2 Samsung drives instead.

Nicolas

Martin Isenburg

Jan 12, 2019, 9:54:53 PM
to LAStools - efficient command line tools for LIDAR processing
Hello,

LAStools is perfectly suited for cloud deployment. I have several clients that have already been using the cloud for many years to process LiDAR with LAStools. In fact, some clients brought me on-site to "optimize" certain features of certain tools for easier cloud deployment.

If you only rent machines in the cloud as you need them, the ease of installing LAStools makes cloud deployment very easy. A big project arrives and you rent 10 machines with 8 physical cores each in the cloud for the required processing time. You install LAStools on all 10 machines, run your processes using your 80 cores as much as possible, and then you uninstall LAStools and give back the machines.

I/O is an issue here. You should really consider using the LASlayers functionality (which really reduces the O) or piped processing. Also switch your point type to the new LAS 1.4 point types 6 or 7, because the native LAS 1.4 decompressor can now decompress only those point attributes that you are really using (which can reduce the I, especially for LAS 1.4 files with many additional attributes and RGBI colors). You may run into a few issues when trying to combine these optimizations, but you should at least try to use one of them.
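A rough sketch of such a piped chain (assuming each tool in the chain accepts the general '-stdin'/'-stdout' options of LAStools; the file names are placeholders):

  rem classify ground, compute heights, then classify vegetation/buildings without writing intermediate files to disk
  lasground -i tile.laz -stdout | lasheight -stdin -stdout | lasclassify -stdin -o tile_classified.laz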

Regards,

Martin

Dwight Crouse

Jan 12, 2019, 10:27:19 PM
to last...@googlegroups.com
Hi Nicolas,

You can always ship hard drives to AWS and they will load the data into folders in your account for you. I was doing 2 TB drives a while back and it only took a day or two once the drives arrived at the local AWS site; in my case Seattle was the closest.

You can also get drives shipped back if you have a large volume of data. Usually my end results took much less storage space than the lidar/imagery, so I could just download the results for client delivery.

Dwight

Ramón Sebastián Bustamante Ortega

Feb 7, 2019, 1:11:26 AM
to LAStools - efficient tools for LiDAR processing
Hi Susana,

I am working with an Azure machine (64 cores, 128 GB RAM) in the cloud with the LAStools software, and my experience processing 200,000 hectares was good. Maybe the only problem was that the virtual machine does not have a graphics card, so you cannot view the point clouds with, for example, the CloudCompare software.

Regards from Chile