dim_STAT v.9.0 can't use Multi-Host Analyze for my add-on


Pjotrs Ovcinnikovs

Jul 23, 2020, 3:36:59 PM
to dim_STAT

Dear Dimitry,


I have tried the new version of dim_STAT, v.9.0-u20-04, and I see a lot of improvements compared to v8, which I used before. Nice !

But I'm stuck with multi-host analysis. After creating my new add-on, I no longer see the multi-host settings. And once my stats are collected and I choose Multi-Host Analyze, I don't see any bookmarks to choose from.

I understand that in v9 I have to create bookmarks for each column of my add-on, but I don't know the right way of doing it. The User Guide has old screenshots and doesn't explain this either.
Could you please advise how to create multi-host graphs for customized add-ons ?

Thanks in advance,
Pjotr

Dimitri

Jul 24, 2020, 6:27:50 AM
to dimstat
Hi Pjotr,

yes, since v9.0 the Multi-Host stats have moved to "bookmarks" to allow much better flexibility than before (it's no longer just a single value from your Add-On, but can be much more complex and advanced criteria ;-))

indeed, I don't recall now if I described anywhere how to proceed with it, but the steps are very simple :
  - you select whatever you want from your Add-On on the Analyze page to get a graph as usual
  - when you see your graph on the next page, you'll also see a button proposing to save it as a new Bookmark
  - near this button you'll see an option to create this Bookmark as Single-Host or Multi-Host
  - the Single-Host option will be available all the time
  - while the Multi-Host option will appear only if the resulting graph matches the Multi-Host "criteria" ;-))
  - i.e. it should end up with a single value to show on the graph, not many (as on Multi-Host graphs you can show only one stat value at a time per graph for all hosts)..

let me know if it worked for you and if you need any more details, etc..

also, mind that the latest public CoreUpdate is v.9.0-u20-05 (and v.9.0-u20-07 is already in the pipe ;-))
since v.9.0 you don't need to reinstall anything for upgrades, you just apply the CoreUpdate live(!) over an existing dim_STAT v9 installation..

Rgds, -Dimitri




Pjotrs Ovcinnikovs

Jul 24, 2020, 10:35:59 AM
to dim...@googlegroups.com
Many thanks, Dimitry !

Really appreciate your quick help. It seems I was close to the solution.
Just to be 100% sure: say I'm using the Linux sar command report, for example:

the shell script sar.sh prints 6 columns:
%user     %nice   %system   %iowait    %steal     %idle

to match the "criteria" for multi-host stats, do I have to create another shell script, sar2.sh, which prints only the %user column, and create a new add-on for it ?

Thanks for reminding me about the new version !
And many thanks for the great tool !

Cheers,
Pjotr

Dimitri

Jul 24, 2020, 11:16:24 AM
to dimstat
Hi Pjotr,

sorry, seems like I caused some confusion on your side ;-))
so, let me try to resolve it now :

- you don't need to create any new add-on, no
- just keep the existing one as it is

let's use it as an example here :
- so, you have a "sar.sh" script printing 6 values related to CPU activity
- in the past you needed to specify during Add-On creation which of the values you wanted to see in Multi-Host
- since v9 you don't need to do anything like this, just create your Add-On, that's all ;-))

how you now "enable" your "sar.sh" values for Multi-Host :
- once some data has been collected from your sar.sh, go to Single-Host Analyze
- select the Collect holding your sar.sh data and click on the button with your Add-On name
- now you'll be able to draw graphs for each of your 6 values
- so you can now click on the "%user" value, get the graph for it => and then save it as a Multi-Host Bookmark !
- same for any of the other values as well
- all this was possible before, but at the Add-On creation step, right ?
- however, the advantage of the new approach is not only in allowing a better description via the Bookmark ;-))
- you can now present the SUM of several values (for ex. %user + %system to be seen as a single value) => and get this as a new Multi-Host Bookmark metric as well
- etc. etc. etc. ;-))

i.e. as long as on the final graph you're showing a single metric value, you can always use it for Multi-Host !

well, this is much simpler to demo than to explain in words, so I have no idea if it's any clearer now or not.. -- so, don't hesitate to ask questions ;-))

Rgds, -Dimitri


Pjotrs Ovcinnikovs

Jul 24, 2020, 12:03:55 PM
to dim...@googlegroups.com
Hi Dimitry,

now I got it :) thanks a lot :) it is logical, but I always wanted to choose more than 1 column in multi-host reports, and therefore never saw it in the bookmarks list box.

One more question, while you are so kind as to help me.... In v8 the dim_STAT server part was able to run for a long time.... Since I started using v9, I see that after a short time, maybe an hour, mysql goes down:

No mysqld pid file found. Looked for /apps/mysql/data/mysqld.pid.

When stopping dimstat server:

[root@test ADMIN]# ./dim_STAT-Server stop
================[ dim_STAT-Server: stop ]================
================[ Checking for ACTIVE collects... ]================
================[ MySQL Server is not running... ]================
I see that mysql is not running...
My question is: is it a configuration issue, or is something missing ?

Another question I have is regarding multi-line stats using awk. I have created a shell script which collects pidstat info about process threads. When I call the script from the command line, it works fine, but when it is called from dim_STAT, all parameters are shifted by 1.

Example:

[root@cpv-app-0 bin]# ./Pidstat.sh 8161 1
Engine-Worker-0 0.01 0.00 0.00 0.01 0
Engine-Worker-1 0.01 0.00 0.00 0.01 0
Engine-Worker-2 0.01 0.00 0.00 0.01 0
Engine-Worker-3 0.01 0.00 0.00 0.01 0
Engine-Worker-4 0.01 0.00 0.00 0.01 0
Engine-Worker-5 0.01 0.00 0.00 0.01 0
Engine-Worker-6 0.01 0.00 0.00 0.01 0
Engine-Worker-7 0.01 0.00 0.00 0.01 0
Engine-Worker-8 0.01 0.00 0.00 0.01 0
Engine-Worker-9 0.01 0.00 0.00 0.01 0

When calling from dimstat:
[root@cpv-app-0 bin]# ./STATcmd -h localhost -p 5000 -c "Pidstat 8161 1"
STAT *** OK CONNECTION 0 sec.

Pidstat.sh:

while true
do
pidstat -p $1  -t | awk '{if ($11 ~ "__Engine-Worker-") {print substr($11,4), $6,$7,$8,$9,$10; count++; if (count==10) {print "\n";count=0}}}'
sleep $2
done


But, when I change Pidstat.sh to:

while true
do
exec /usr/bin/pidstat -p $1  -t | awk '{print substr($10,4),$5,$6,$7,$8,$9}' | grep Engine-W
sleep $2
printf "\n"
done

it is working fine:

[root@cpv-app-0 bin]# ./STATcmd -h localhost -p 5000 -c "Pidstat 8161 1"
STAT *** OK CONNECTION 0 sec.
STAT *** OK COMMAND (cmd: Pidstat)
Engine-Worker-0 0.01 0.00 0.00 0.01 1
Engine-Worker-1 0.01 0.00 0.00 0.01 0
Engine-Worker-2 0.01 0.00 0.00 0.01 0
Engine-Worker-3 0.01 0.00 0.00 0.01 1
Engine-Worker-4 0.01 0.00 0.00 0.01 1
Engine-Worker-5 0.01 0.00 0.00 0.01 0
Engine-Worker-6 0.01 0.00 0.00 0.01 0
Engine-Worker-7 0.01 0.00 0.00 0.01 1
Engine-Worker-8 0.01 0.00 0.00 0.01 0
Engine-Worker-9 0.01 0.00 0.00 0.01 1

Many thanks for your effort !,
Pjotr

Dimitri

Jul 24, 2020, 1:37:55 PM
to dimstat
Hi Pjotr,

to finish with Bookmarks => just to remind you that "Bookmark PRESETs" are now available in v9, which allow you to group several Bookmarks under a single PRESET button (single click to show them all) -- saving a lot of time if you're looking at similar things over and over, and also allowing you to group several needed stats under a single name (PRESET) according to activity / system / workload / etc..

Now, regarding the stopped MySQL Server -- this is totally abnormal.. Could you check the contents of your /apps/mysql/data/mysqld.log file ? -- is there anything about a shutdown ?

also, which OS are you running on ? (I have had dim_STAT running on Linux and Mac for many long weeks and have never seen MySQL stop by surprise, generally it's just rock solid; on my Linux LAB host MySQL Server already has 300+ days of uptime and has never stopped in that time, maybe it will only on the next planned LAB power-off cycle ;-))

As for your script story -- I'm rather confused..
what is working and what is not ?
especially since when you're using "exec" in your script it'll just fork to "pidstat" and never continue the expected loop..
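
For reference, a small sketch of how "exec" behaves in bash in the two situations that matter here. On a line of its own, "exec" replaces the shell, so nothing after it runs. As the first stage of a pipeline, bash runs it in a subshell, so only that subshell is replaced and the loop carries on. The function names plain_exec and piped_exec are made up for this illustration, and "echo" stands in for pidstat.

```shell
#!/bin/bash
# Case 1: plain `exec` replaces the shell itself, so the loop stops
# after the first iteration and "after loop" is never printed.
plain_exec() {
    bash -c 'for i in 1 2; do exec echo "tick $i"; done; echo "after loop"'
}

# Case 2: `exec` as the first stage of a pipeline runs in a subshell,
# so only that subshell is replaced and the loop keeps going.
piped_exec() {
    bash -c 'for i in 1 2; do exec echo "tick $i" | cat; done; echo "after loop"'
}

echo "--- plain exec ---"
plain_exec
echo "--- exec inside a pipeline ---"
piped_exec
```

So a line such as "exec /usr/bin/pidstat ... | awk ..." can still loop, although dropping the "exec" makes the intent clearer.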

Rgds, -Dimitri


Pjotrs Ovcinnikovs

Jul 24, 2020, 3:42:19 PM
to dim...@googlegroups.com
Hi Dimitry,

1. thanks a lot for the Bookmark PRESET hint. really helpful
2. regarding mysql: I completely agree with you. When I used v8, I had the dim_STAT server running for ages without any problem.

in the mysql.log I can see a "shutdown" event. this is the source of the problem, I guess.

200724 18:05:01 [Note] /apps/mysql/bin/mysqld: Normal shutdown

200724 18:05:01 [Note] Event Scheduler: Purging the queue. 0 events
200724 18:05:01  InnoDB: Starting shutdown...
200724 18:05:02  InnoDB: Shutdown completed; log sequence number 33785481
200724 18:05:02 [Note] /apps/mysql/bin/mysqld: Shutdown complete

200724 18:05:02 mysqld_safe mysqld from pid file /apps/mysql/data/mysqld.pid ended

Linux is RHEL 7.7:
[root@cpv-oam-0 ~]# cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.7 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.7"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.7 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.7:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.7
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.7"

3. regarding the awk issue in the script: sorry for my bad explanation. let me explain.

I wrote a shell script to parse the output of the pidstat -p <pid> -t command, to get stats related to some threads.
It looked like:

#!/bin/bash

while true
do
/usr/bin/pidstat -p $1 -t | awk '{print substr($11,4),$6,$7,$8,$9,$10}' | grep Engine-W

sleep $2
printf "\n"
done

Result from shell call:
[root@cpv-app-0 bin]# ./Pidstat.sh 8161 1
Engine-Worker-0 0.01 0.00 0.00 0.01 1
Engine-Worker-1 0.01 0.00 0.00 0.01 1
Engine-Worker-2 0.01 0.00 0.00 0.01 1

Engine-Worker-3 0.01 0.00 0.00 0.01 0
Engine-Worker-4 0.01 0.00 0.00 0.01 1
Engine-Worker-5 0.01 0.00 0.00 0.01 0
Engine-Worker-6 0.01 0.00 0.00 0.01 0
Engine-Worker-7 0.01 0.00 0.00 0.01 0
Engine-Worker-8 0.01 0.00 0.00 0.01 0
Engine-Worker-9 0.01 0.00 0.00 0.01 1

Call from dimStat:
[root@cpv-app-0 bin]# ./STATcmd -h localhost -c "Pidstat 8161 1"

STAT *** OK CONNECTION 0 sec.
STAT *** OK COMMAND (cmd: Pidstat)
 0.00 0.00 0.01 1 |__Engine-Worker-0
 0.00 0.00 0.01 0 |__Engine-Worker-1
 0.00 0.00 0.01 0 |__Engine-Worker-2
 0.00 0.00 0.01 0 |__Engine-Worker-3
 0.00 0.00 0.01 1 |__Engine-Worker-4
 0.00 0.00 0.01 0 |__Engine-Worker-5
 0.00 0.00 0.01 0 |__Engine-Worker-6
 0.00 0.00 0.01 0 |__Engine-Worker-7
 0.00 0.00 0.01 0 |__Engine-Worker-8
 0.00 0.00 0.01 0 |__Engine-Worker-9

So, parameters were shifted by 1 in the dimStat call. When I changed the script to:

while true
do

/usr/bin/pidstat -p $1 -t | awk '{print substr($10,4),$5,$6,$7,$8,$9}' | grep Engine-W
sleep $2
printf "\n"
done

Result was OK:

[root@cpv-app-0 bin]# ./STATcmd -h localhost -c "Pidstat 8161 1"

STAT *** OK CONNECTION 0 sec.
STAT *** OK COMMAND (cmd: Pidstat)
Engine-Worker-0 0.01 0.00 0.00 0.01 1
Engine-Worker-1 0.01 0.00 0.00 0.01 0
Engine-Worker-2 0.01 0.00 0.00 0.01 0
Engine-Worker-3 0.01 0.00 0.00 0.01 1
Engine-Worker-4 0.01 0.00 0.00 0.01 1
Engine-Worker-5 0.01 0.00 0.00 0.01 1
Engine-Worker-6 0.01 0.00 0.00 0.01 1

Engine-Worker-7 0.01 0.00 0.00 0.01 0
Engine-Worker-8 0.01 0.00 0.00 0.01 0
Engine-Worker-9 0.01 0.00 0.00 0.01 0


Thank you for your effort !
Cheers,
Pjotr

Dimitri

Jul 24, 2020, 4:53:20 PM
to dimstat
Hi Pjotr,

regarding the MySQL Server shutdown :
- look into mysqld.log in detail
- if no error was reported, and this is a proper shutdown => then someone is sending a kill signal to the mysqld process ;-))
- check your system logs, etc.
- MySQL Server will never stop by itself (unless it hits some errors)
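
One possible way to do the "check your system logs" step, as a rough sketch only. It assumes that whatever sends the signal writes a syslog line naming both the signal and mysqld, which is not guaranteed. The helper name find_mysqld_killers and the log path in the usage comment are assumptions, not part of dim_STAT.

```shell
#!/bin/bash
# find_mysqld_killers FILE...
# Print log lines that mention a SIGKILL/SIGTERM and also mention mysqld.
find_mysqld_killers() {
    grep -hE 'SIG(KILL|TERM)' "$@" | grep -F 'mysqld'
}

# Typical use (the log location is an assumption, adjust to your system):
#   find_mysqld_killers /var/log/messages*
```

If this prints nothing, cron tables and monitoring agents on the host are the next place to look.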

as for your script :
- indeed, it's clearer now
- and you're hitting here a pretty common mistake many people make when writing shell scripts..
- the main point : you always need to keep in mind the "local env." your script is running in !
- to avoid any unexpected errors, make sure to explicitly set your env. in your script !
- for ex. by explicitly setting LANG=C and LC_ALL=C you'll avoid any further potential mismatch in your script
- because the problem you're hitting is related to the first column, which is "time"
- and depending on the locale in use when the script is executed, it will or will not have "AM" in the output ;-))
- (which gives you one column more or less in each output line ;-))
- while in other env. you'll have commas instead of dots in your numbers, which will generate tons of SQL errors on input..

All STAT-service env. and scripts use LANG=C and LC_ALL=C by default to avoid these kinds of problems. I can only advise you to do the same in your scripts as well to keep them rock solid.
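
A minimal sketch of what this could look like for the Pidstat.sh script from this thread. It pins the locale as advised and, as an extra safety net that is not part of the advice above, counts awk fields from the end of the line (NF), so an extra "AM"/"PM" field at the front cannot shift the columns. The function names filter_threads and collect are hypothetical.

```shell
#!/bin/bash
# Pin the locale so tool output has the same columns everywhere:
# 24-hour timestamps (no separate AM/PM field) and dots as decimal marks.
export LANG=C LC_ALL=C

# Keep only the thread rows and print "<thread name> <five numeric columns>".
# The thread name is the last field, so the values are addressed relative
# to NF rather than by absolute position.
filter_threads() {
    awk '$NF ~ /^[|]__Engine-Worker-/ {
        print substr($NF, 4), $(NF-5), $(NF-4), $(NF-3), $(NF-2), $(NF-1)
    }'
}

# collect PID INTERVAL: endless sampling loop, one blank line per sample.
collect() {
    while true
    do
        /usr/bin/pidstat -p "$1" -t | filter_threads
        printf "\n"
        sleep "$2"
    done
}

# In the real script, the last line would be:
#   collect "$@"
```

With this, the same script should print identical columns whether it is started from an interactive shell or by the STAT-service.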

Rgds,
-Dimitri



Pjotrs Ovcinnikovs

Jul 25, 2020, 9:11:35 AM
to dim...@googlegroups.com
Hello Dimitry,

first of all, many thanks for your fantastic support !

You are absolutely right regarding the mysql issue - I found a cron job which was killing it:

messages-20200719:2020-07-17T13:10:06.625010+00:00 test /usr/sbin/KillIdleSessions[13352]: Sending SIGKILL to: Process (mysqld) /apps/mysql/bin/mysqld --defaults-file=/apps/mysql/my.conf --basedir=/apps/mysq; pid 5306; uid 1001; euid 1001 at Fri Jul 17 13:10:06 2020

regarding the shell script, thanks a lot for pointing out this problem ! your suggestion really works ! I had run out of ideas. many thanks !

Have a nice weekend !
Cheers,
Pjotr

Dimitri

Jul 26, 2020, 6:14:35 AM
to dimstat
Hi Pjotr,

You're welcome ;-))

also, I could not recall where I've explained the Multi-Host Bookmark feature, but in fact it was here already 9 years ago :

and many other new things in v9 are explained in the following articles :

(better to start from the bottom, as the newer articles come first ;-))

while all new articles will be available from here :
  - http://dimitrik.free.fr/blog/categories/cat_dim_stat.html

(I've switched blog post management to simpler and more efficient SW)

have fun reading ;-))

Rgds,
-Dimitri


Pjotrs Ovcinnikovs

Jul 27, 2020, 12:24:30 PM
to dim...@googlegroups.com
Hi Dimitry,

many thanks for such useful information. I now recall that I had already read some of these articles, but it was my bad reading that I didn't get that multi-host graphs are based on a single-column attribute.

Today I got stuck on another problem. A script is collecting GC pause stats, and the values are given in seconds and are quite low:

  8   09:15:14 2020-07-27 (#460)   0.00607 
  8   09:15:15 2020-07-27 (#461)   0.00587 
  8   09:15:16 2020-07-27 (#462)   0.00587 
  8   09:15:17 2020-07-27 (#463)   0.00587 
  8   09:15:18 2020-07-27 (#464)   0.00587 
  8   09:15:19 2020-07-27 (#465)   0.00615 
  8   09:15:20 2020-07-27 (#466)   0.00615 
  8   09:15:21 2020-07-27 (#467)   0.00615 
  8   09:15:22 2020-07-27 (#468)   0.00615

When generating the graph, the scale is always 1.000 and the curve is very hard to read (graph image attached).

Is this due to some limitation in scale values, or is it possible to tune this somehow ?

Thanks a lot for your brilliant support !

Cheers,
Pjotr

Dimitri

Jul 27, 2020, 4:09:01 PM
to dimstat
Hi Pjotr,

the story will always be painful with too big or too small values.. -- you'll quickly feel the PITA of counting all these zeros (at the beginning or at the end, etc.)..

personally I'm doing the following :
- if very small values are expected, then use a smaller unit (like in your case use values counted in "ms" rather than in "sec")
- same for very big ones (MB or KB instead of Bytes, and so on)..
- and if you don't know what to expect (e.g. it can vary from very small to very big) => then simply add more columns to the output and represent the same value in different units (in your case for ex. usec, ms, sec.)
- and the final problem after all : when your values can vary too much, even within a short time interval, you'll be constantly lost with them.. => in this case I'm adding yet one more column with a "power10" (or more) representation of the value, which allows me to easily and clearly see the "tendency" in value "waves" going up and down, etc..
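
The points above could be sketched as a small awk filter appended to the collecting script. It reads one value in seconds per line and prints the same value in sec, ms and usec, plus a log10 column. Reading "power10" as log10 is an assumption made here, and the function name to_units is hypothetical.

```shell
#!/bin/bash
# to_units: read one value in seconds per line and print
# "<sec> <ms> <usec> <log10 of sec>". Blank separator lines pass through.
to_units() {
    LC_ALL=C awk '
        NF == 0 { print ""; next }
        {
            sec = $1
            # log10 of the value; 0 is printed for non-positive input
            lg = (sec > 0) ? log(sec) / log(10) : 0
            printf "%.5f %.2f %.0f %.2f\n", sec, sec * 1000, sec * 1000000, lg
        }'
}

# Example: echo 0.00607 | to_units
```

Each extra column then becomes a separate value of the Add-On, so the most readable one can be picked at graph time.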

hope this helps ;-))

Rgds,
-Dimitri


Pjotrs Ovcinnikovs

Jul 28, 2020, 4:03:59 AM
to dim...@googlegroups.com
Hi Dimitri,

thanks a lot for the advice, as usual very helpful ! In my case, a simple conversion from seconds to milliseconds did the trick.

Have a nice day !
Pjotr
