TTK use-case for commodity business data


nagesh danturti

Jan 27, 2024, 12:49:41 AM
to ttk-users
Hi

I represent a company called Wassching Group from Netherlands and we recently started a Data Topology company called Wingenium (wingenium.com) that aims to commoditize TDA (Persistent Homology and Mapper algorithms) to common business data with a desire to extract behaviors by creating Topological signatures and use them for classification purposes in (near)real-time.

 

I have followed the ParaView+TTK efforts in modeling point clouds to extract features through persistence diagrams. However, I am struggling to get a simple 20-column CSV file (with categorical features) into ParaView-TTK pipelines successfully. Despite converting the categorical features to valid floating-point values, I still do not understand the pipeline that could get me from a simple CSV → Morse-Smale complexes, Reeb graphs or persistence diagrams to detect shapes and features.
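One common way to handle the categorical-to-numeric conversion mentioned above is one-hot encoding, which avoids imposing an artificial ordering on category labels. The following is a minimal sketch using only the Python standard library; it is not part of TTK, and the helper name `encode_csv` and the sample columns `amount` and `channel` are hypothetical.

```python
import csv
import io


def encode_csv(text):
    """One-hot encode non-numeric columns; return (header, rows of floats)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys())

    def is_numeric(col):
        # A column is numeric if every value parses as a float.
        try:
            for row in rows:
                float(row[col])
            return True
        except ValueError:
            return False

    header, encoders = [], []
    for col in columns:
        if is_numeric(col):
            header.append(col)
            encoders.append(lambda r, c=col: [float(r[c])])
        else:
            # One output column per distinct category value.
            levels = sorted({row[col] for row in rows})
            header.extend(f"{col}={level}" for level in levels)
            encoders.append(
                lambda r, c=col, lv=levels: [1.0 if r[c] == v else 0.0 for v in lv]
            )

    data = [[x for enc in encoders for x in enc(row)] for row in rows]
    return header, data


# Hypothetical sample data: one numeric and one categorical column.
sample = "amount,channel\n10.5,web\n3.0,branch\n7.25,web\n"
header, data = encode_csv(sample)
print(header)
print(data)
```

The resulting all-numeric table can then be written back out as a CSV in which every row is one point and every column is one coordinate.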

 

“Could anyone please point out what the best practice is here, and how one goes about preparing commonly available business data up to the point where one can start working with ParaView-TTK pipelines?”

 

I would highly appreciate your help/advice and the gesture.


Thank you


Kind regards 

Nagesh Danturti

wingenium.com


Julien Tierny

Jan 27, 2024, 1:32:38 AM
to ttk-users, nagesh danturti
Dear Nagesh,

thanks for your email and your interest in TTK.

> Despite converting the categorical features to
> valid float points, I am still struggling to understand the pipeline that
> could get me from a simple CSV → Morse-Smale complex or Reeb graphs or
> Persistence Diagrams to detect shapes and features.
First off, TTK has been specifically designed to analyze low dimensional data (i.e. 1D, 2D, 3D, as found in imaging for instance). I'd say that this is where it shines the most at the moment.

That being said, TTK can also handle high-dimensional point cloud data.
However, it will only model the first 3 intrinsic dimensions of your object (possibly living in a space of arbitrarily high dimensions).

For instance, you can build a Rips complex of a high-dimensional point cloud, but it will have simplices of dimensions up to 3, which will allow you, for example, to compute persistence diagrams for the homology groups of dimensions 0 (connected components), 1 (cycles) and 2 (voids).

Then, once you have a Rips complex computed from your (high-dimensional) point cloud data and a scalar function defined on it (the Rips complex filter can compute a few representative scalar descriptors), all the features from TTK are available, like critical points, persistence diagrams, persistent generators, Reeb graphs, Morse-Smale complexes, etc. (certain features, such as topological simplification, assume a manifold domain though).
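To make the dimension-0 case concrete: the finite 0-dimensional persistence pairs of a Rips filtration can be obtained by processing pairwise distances in increasing order with a union-find structure (every merge of two connected components kills one class). The sketch below is a plain-Python illustration of that idea, not TTK's implementation; the function name `h0_persistence` and the sample points are hypothetical, and it assumes the convention that an edge enters the filtration at a scale equal to the distance between its endpoints.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Death values of the finite H0 classes of a Rips filtration."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length (works in any dimension).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge: one H0 class (born at 0) dies here.
            parent[ri] = rj
            deaths.append(length)
    return deaths


# Hypothetical sample: a tight cluster of three points and one far outlier.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
deaths = h0_persistence(points)
print(deaths)
```

In this sample, the single large death value corresponds to the isolated point, which is the kind of signal one might inspect when looking for outliers. Note that enumerating all pairs is quadratic in the number of points, so this sketch is only suitable for small samples.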

Now, to put pipelines together (i.e. combining available algorithms in a single analysis), you may want to have a look at the "TTK Online Example Database" (https://topology-tool-kit.github.io/examples/index.html), which documents pipeline examples.

For instance, you may want to have a look at the following examples:
- https://topology-tool-kit.github.io/examples/persistentGenerators_householdAnalysis/
- https://topology-tool-kit.github.io/examples/persistentGenerators_periodicPicture/

Also, you have the possibility of pre-reducing the dimensionality of your data (prior to using the above TTK features), to ease visual exploration.
For this, you may want to check out this dimensionality reduction method (which preserves the 0-dimensional persistent homology group): https://topology-tool-kit.github.io/examples/topoMapTeaser/

I hope this helps.

Please let us know if you need any further information.

Best regards,
--
Dr Julien Tierny
CNRS Researcher
Sorbonne Universite
https://julien-tierny.github.io/

nagesh danturti

Jan 27, 2024, 6:34:11 AM
to ttk-users
Dear Tierny

Much appreciated and many thanks for your prompt response! 

My use-case primarily involves high-dimensional data from large-scale financial transactions (from a financial institution), with 75-100 dimensions of financial parameters (columns) for millions of transactions (rows). The goal is to extract and classify "behavior" (including anomalies, outliers, aberrations), to be used later to pre-empt unwanted behavior before it occurs.

In such a use-case, what approach/methodology, using TTK tooling, would you advise? I am aware that I am trying to use TTK as a hammer, and that I may not see any "discernible" topology, in which case TTK may not be very useful here. But please give me your best assessment of whether I am shooting in the dark.

Thanks once again

Kind regards
Nagesh Danturti

nagesh danturti

Feb 19, 2024, 8:44:34 AM
to ttk-users
Dear all,

I am facing this issue on a custom-built ParaView Docker image with the TTK extension. Has anyone else faced this problem? (screenshot attached)

I have ensured that the base linux image installs the qhull (library) before building paraview image and subsequently the ttk image. But at runtime, it still complains of qhull being unavailable. Any suggestions? 

Many thanks and much appreciated.

Kind regards
Nagesh Danturti
Dockerfile
qhull.png

Christoph Garth

Feb 19, 2024, 9:19:34 AM
to ttk-...@googlegroups.com
Dear Nagesh,

thanks for your inquiry. 

As a quick check, are the following fulfilled:

1) You are building a Docker image of a reasonably recent TTK version (TTK_ENABLE_QHULL can be configured)
2) the libqhull-dev package is installed before the TTK build
3) TTK_ENABLE_QHULL is set to ON

This should be sufficient to include qhull in the plugin within the Docker image. 
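The checks above can be sketched as a small script to run inside the container. This is a minimal sketch under stated assumptions, not an official TTK diagnostic: the helper `cmake_cache_value` is hypothetical, the sample cache text stands in for the real `CMakeCache.txt` of the TTK build directory, and the shared-library name `qhull_r` is an assumption about how the distribution packages Qhull.

```python
import ctypes.util


def cmake_cache_value(cache_text, name):
    """Return the value of a CMake cache entry (KEY:TYPE=VALUE), or None."""
    for line in cache_text.splitlines():
        line = line.strip()
        # Skip comments and anything that is not a KEY:TYPE=VALUE entry.
        if line.startswith(("#", "//")) or "=" not in line:
            continue
        key_and_type, value = line.split("=", 1)
        key = key_and_type.split(":", 1)[0]
        if key == name:
            return value
    return None


# Hypothetical stand-in for the contents of the build's CMakeCache.txt.
sample_cache = (
    "// sample entry for illustration\n"
    "TTK_ENABLE_QHULL:BOOL=ON\n"
    "CMAKE_BUILD_TYPE:STRING=Release\n"
)

qhull_option = cmake_cache_value(sample_cache, "TTK_ENABLE_QHULL")
# Assumed library name; returns None if the loader cannot locate it.
qhull_library = ctypes.util.find_library("qhull_r")

print("TTK_ENABLE_QHULL =", qhull_option)
print("qhull_r shared library:", qhull_library or "not found")
```

If the option reads ON but the library lookup reports "not found" in the final image, one possibility worth checking is that the development package was present in the build stage but the shared library is missing from the runtime image.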

Please don’t hesitate to ask if this is not easy for you to ascertain.

Cheers
Christoph

------------------------------
Prof. Dr. Christoph Garth
University of Kaiserslautern
Scientific Visualization Group

mail: Postfach 3049, 67653 Kaiserslautern, Germany
phone: +49 (0)631 205 3800
fax: +49 (0)631 205 3270
web: http://vis.uni-kl.de



--
You received this message because you are subscribed to the Google Groups "ttk-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ttk-users+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ttk-users/2cd86927-6eaa-45b7-a591-390d3f5e8371n%40googlegroups.com.
<Dockerfile><qhull.png>

nagesh danturti

Feb 19, 2024, 9:32:37 AM
to ttk-...@googlegroups.com, ga...@rptu.de

Dear Christoph


Much appreciated for your prompt response!!!

 

On your questions:

 

  1. I am using ttk-dev (because I want to leverage the TopoMap dimensionality reduction technique for my use-case). From GitHub, I believe that ttk-dev is the only release that contains the TopoMap libraries. Please correct me if I am mistaken.

 

  2. Yes, my Dockerfile explicitly instructs the base Linux image (as shown below and also attached) to install “libqhull-dev”:

 

# install base development env
RUN apt-get install --no-install-recommends -yqq \
    build-essential \
    ninja-build \
    cmake \
    dlocate \
    file \
    curl \
    ccache \
    libboost-system-dev \
    libeigen3-dev \
    libgraphviz-dev \
    libosmesa-dev \
    libopenmpi-dev \
    libsqlite3-dev \
    libwebsocketpp-dev \
    graphviz \
    zlib1g-dev \
    libqhull-dev \
    dpkg-dev

 

 

  3. About the TTK_ENABLE_QHULL option: I see, you must be referring to the ttk.sh file, correct? I did not set the option there. Can you please confirm whether the script below (taken from ttk.sh) is where I need to enable the QHULL option, as shown here:

 

#! /bin/bash
set -e

require-pkgs \
    build-essential          \
    cmake                    \
    curl                     \
    libboost-system-dev      \
    libcgns-dev              \
    libeigen3-dev            \
    libexpat1-dev            \
    libfreetype6-dev         \
    libhdf5-dev              \
    libjpeg-dev              \
    libjsoncpp-dev           \
    liblz4-dev               \
    liblzma-dev              \
    libnetcdf-cxx-legacy-dev \
    libnetcdf-dev            \
    libogg-dev               \
    libpng-dev               \
    libprotobuf-dev          \
    libpugixml-dev           \
    libsqlite3-dev           \
    libgraphviz-dev          \
    libtheora-dev            \
    libtiff-dev              \
    libxml2-dev              \
    ninja-build              \
    protobuf-compiler        \
    python3-dev              \
    python3-numpy-dev        \
    libqhull-dev             \
    zlib1g-dev

if [ -n "${DEV}" ]; then
    #echo "DEVELOPER MODE"
    exit
fi

# get source code
(curl -kL https://github.com/topology-tool-kit/ttk/archive/${TTK_VERSION}.tar.gz | tar zx --strip-components 1) ||
(curl -kL https://github.com/topology-tool-kit/ttk/archive/v${TTK_VERSION}.tar.gz | tar zx --strip-components 1)

# actually compile
cmake-default \
    -DTTK_BUILD_DOCUMENTATION=OFF \
    -DTTK_BUILD_PARAVIEW_PLUGINS=ON \
    -DTTK_BUILD_STANDALONE_APPS=OFF \
    -DTTK_BUILD_VTK_WRAPPERS=ON \
    -DTTK_BUILD_VTK_PYTHON_MODULE=OFF \
    -DTTK_ENABLE_DOUBLE_TEMPLATING=OFF \
    -DTTK_ENABLE_CPU_OPTIMIZATION=OFF \
    -DTTK_ENABLE_OPENMP=ON \
    -DTTK_ENABLE_KAMIKAZE=ON \
    -DTTK_ENABLE_QHULL=ON \
    ..

# call Ninja manually to ignore duplicate targets
# cmake --build .

# ninja -w dupbuild=warn install
# cmake --install .

# popd

 

 

Again many thanks for helping out.

 

Kind regards
Nagesh Danturti

+31-6-41211780

 


Dockerfile
ttk.sh

Julien Tierny

Feb 21, 2024, 4:42:51 AM
to ttk-users, nagesh danturti
Dear Nagesh,

thanks for your message.

> I have ensured that the base linux image installs the qhull (library)
> before building paraview image and subsequently the ttk image. But at
> runtime, it still complains of qhull being unavailable. Any suggestions?
Note that this is a warning only.
If qhull is not found, boost will be used instead (for convex hull computations).
The warning just reports that we have observed bugs in boost's implementation of convex hulls, but these only affect very few cases. In general, you should be fine.

Best,
--
Dr Julien Tierny
CNRS Researcher
Sorbonne Universite
https://julien-tierny.github.io/

nagesh danturti

Feb 26, 2024, 5:01:26 AM
to ttk-users
Dear Julien

Thanks for your response. I do get an error when I run the TopoMap dimensionality reduction (as shown in the attached screenshot):

2024-02-26 10:57:55 [Common] Welcome!
2024-02-26 10:58:04 [DimensionReduction] %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2024-02-26 10:58:04 [DimensionReduction] Using backend `TopoMap (IEEE VIS 2020)`
2024-02-26 10:58:04 [TopoMap] Using Boost for convex hulls.
2024-02-26 10:58:04 [TopoMap] Input data: 11818 points (4 dimensions).
2024-02-26 10:58:06 [DimensionReduction] Computed TopoMap ........................ [2.594s|10T|100%]
2024-02-26 10:58:04 [TopoMap] [WARNING] Qhull was enabled but it is not installed or was not found.
2024-02-26 10:58:04 [TopoMap] [WARNING] Defaulting to Boost support instead.
2024-02-26 10:58:04 [TopoMap] [WARNING] Bugs have been reported in Boost's implementation.
2024-02-26 10:58:04 [TopoMap] [WARNING] Consider enabling Qhull instead.
2024-02-26 10:58:06 [TopoMap] [ERROR] No valid angle was found, due to errors in the convex hull computation. Please consider using Qhull instead of Boost. Aborting.

Please advise

Kind regards
Nagesh



error.png