Microsoft C4244 warnings ("'conversion' conversion from 'type1' to 'type2', possible loss of data")

131 views
Skip to first unread message

Joev Dubach

unread,
Jul 2, 2024, 6:42:42 PM7/2/24
to lua-l
I build Lua using Microsoft Visual Studio 2022 17.10.2, and my company uses a BinSkim policy checker that requires me to enable compiler warning C4244 ("'conversion' conversion from 'type1' to 'type2', possible loss of data").

To fix these warnings, any code with an implicit cast that reduces the size of an integral value has to be changed to use an explicit cast. The safety level of the explicit cast is the same as that of the implicit cast. For example, an int value of 257 might be unintentionally replaced by 1 when storing it in an lu_byte, and explicitly casting it to lu_byte before storing it doesn't change this. But I tend to agree that fixing these warnings by making the casts explicit will improve code quality, because when reading the revised code, it's more obvious that "this operation is risky".

Here is a patch against the released Lua 5.4.7 that would fix all these warnings:
>>>>
diff -U 0 lua-5.4.7/src/lapi.c lua-5.4.7-c4244/src/lapi.c
--- lua-5.4.7/src/lapi.c 2024-06-13 18:15:09.000000000 -0400
+++ lua-5.4.7-c4244/src/lapi.c 2024-07-02 17:51:39.076462000 -0400
@@ -1204 +1204 @@
-        g->genminormul = minormul;
+        g->genminormul = (lu_byte)minormul;
@@ -1220 +1220 @@
-        g->gcstepsize = stepsize;
+        g->gcstepsize = (lu_byte)stepsize;
@@ -1279 +1279 @@
-    L->ci->nresults = codeNresults(nresults);  /* mark it */
+    L->ci->nresults = (short)codeNresults(nresults);  /* mark it */
diff -U 0 lua-5.4.7/src/lauxlib.c lua-5.4.7-c4244/src/lauxlib.c
--- lua-5.4.7/src/lauxlib.c 2024-06-13 18:15:09.000000000 -0400
+++ lua-5.4.7-c4244/src/lauxlib.c 2024-07-02 17:51:39.424450100 -0400
@@ -813 +813 @@
-    lf.buff[lf.n++] = c;  /* 'c' is the first character of the stream */
+    lf.buff[lf.n++] = (char)c;  /* 'c' is the first character of the stream */
diff -U 0 lua-5.4.7/src/lcode.c lua-5.4.7-c4244/src/lcode.c
--- lua-5.4.7/src/lcode.c 2024-06-13 18:15:09.000000000 -0400
+++ lua-5.4.7-c4244/src/lcode.c 2024-07-02 17:51:39.611254900 -0400
@@ -342 +342 @@
-  f->lineinfo[pc] = linedif;
+  f->lineinfo[pc] = (ls_byte)linedif;
@@ -482 +482 @@
-  fs->freereg += n;
+  fs->freereg = (lu_byte)(fs->freereg + n);
@@ -1289,2 +1289,2 @@
-    t->u.ind.t = temp;  /* (can't do a direct assignment; values overlap) */
-    t->u.ind.idx = k->u.info;  /* literal short string */
+    t->u.ind.t = (lu_byte)temp;  /* (can't do a direct assignment; values overlap) */
+    t->u.ind.idx = (short)k->u.info;  /* literal short string */
@@ -1295 +1295 @@
-    t->u.ind.t = (t->k == VLOCAL) ? t->u.var.ridx: t->u.info;
+    t->u.ind.t = (lu_byte)((t->k == VLOCAL) ? t->u.var.ridx: t->u.info);
@@ -1297 +1297 @@
-      t->u.ind.idx = k->u.info;  /* literal short string */
+      t->u.ind.idx = (short)k->u.info;  /* literal short string */
@@ -1301 +1301 @@
-      t->u.ind.idx = cast_int(k->u.ival);  /* int. constant in proper range */
+      t->u.ind.idx = (short)cast_int(k->u.ival);  /* int. constant in proper range */
@@ -1305 +1305 @@
-      t->u.ind.idx = luaK_exp2anyreg(fs, k);  /* register */
+      t->u.ind.idx = (short)luaK_exp2anyreg(fs, k);  /* register */
@@ -1822 +1822 @@
-  fs->freereg = base + 1;  /* free registers with list values */
+  fs->freereg = (lu_byte)(base + 1);  /* free registers with list values */
diff -U 0 lua-5.4.7/src/ldebug.c lua-5.4.7-c4244/src/ldebug.c
--- lua-5.4.7/src/ldebug.c 2024-06-13 18:15:09.000000000 -0400
+++ lua-5.4.7-c4244/src/ldebug.c 2024-07-02 17:51:39.800148600 -0400
@@ -921 +921 @@
-  lu_byte mask = L->hookmask;
+  lu_byte mask = (lu_byte)L->hookmask;
diff -U 0 lua-5.4.7/src/ldo.c lua-5.4.7-c4244/src/ldo.c
--- lua-5.4.7/src/ldo.c 2024-06-13 18:15:09.000000000 -0400
+++ lua-5.4.7-c4244/src/ldo.c 2024-07-02 17:51:39.989720700 -0400
@@ -222 +222 @@
-  G(L)->gcstopem = oldgcstop;  /* restore emergency collection */
+  G(L)->gcstopem = (lu_byte)oldgcstop;  /* restore emergency collection */
@@ -342,2 +342,2 @@
-      ci->u2.transferinfo.ftransfer = ftransfer;
-      ci->u2.transferinfo.ntransfer = ntransfer;
+      ci->u2.transferinfo.ftransfer = (unsigned short)ftransfer;
+      ci->u2.transferinfo.ntransfer = (unsigned short)ntransfer;
@@ -506,2 +506,2 @@
-  ci->nresults = nret;
-  ci->callstatus = mask;
+  ci->nresults = (short)nret;
+  ci->callstatus = (unsigned short)mask;
@@ -828 +828 @@
-    setcistrecst(ci, status);  /* status to finish 'pcall' */
+    setcistrecst(ci, (short)status);  /* status to finish 'pcall' */
diff -U 0 lua-5.4.7/src/lgc.c lua-5.4.7-c4244/src/lgc.c
--- lua-5.4.7/src/lgc.c 2024-06-13 18:15:09.000000000 -0400
+++ lua-5.4.7-c4244/src/lgc.c 2024-07-02 17:51:40.256319100 -0400
@@ -263 +263 @@
-  o->tt = tt;
+  o->tt = (lu_byte)tt;
@@ -926 +926 @@
-    g->gcstp = oldgcstp;  /* restore state */
+    g->gcstp = (lu_byte)oldgcstp;  /* restore state */
@@ -1578 +1578 @@
-    g->gcstate = nextstate;
+    g->gcstate = (lu_byte)nextstate;
@@ -1733 +1733 @@
-  g->gcemergency = isemergency;  /* set flag */
+  g->gcemergency = (lu_byte)isemergency;  /* set flag */
diff -U 0 lua-5.4.7/src/liolib.c lua-5.4.7-c4244/src/liolib.c
--- lua-5.4.7/src/liolib.c 2024-06-13 18:15:09.000000000 -0400
+++ lua-5.4.7-c4244/src/liolib.c 2024-07-02 17:51:40.494205800 -0400
@@ -447 +447 @@
-    rn->buff[rn->n++] = rn->c;  /* save current char */
+    rn->buff[rn->n++] = (char)rn->c;  /* save current char */
@@ -531 +531 @@
-      buff[i++] = c;  /* read up to end of line or buffer limit */
+      buff[i++] = (char)c;  /* read up to end of line or buffer limit */
@@ -536 +536 @@
-    luaL_addchar(&b, c);  /* add ending newline to result */
+    luaL_addchar(&b, (char)c);  /* add ending newline to result */
diff -U 0 lua-5.4.7/src/lparser.c lua-5.4.7-c4244/src/lparser.c
--- lua-5.4.7/src/lparser.c 2024-06-13 18:15:09.000000000 -0400
+++ lua-5.4.7-c4244/src/lparser.c 2024-07-02 17:51:40.698073400 -0400
@@ -269 +269 @@
-  e->u.var.vidx = vidx;
+  e->u.var.vidx = (unsigned short)vidx;
@@ -318,2 +318,2 @@
-    var->vd.ridx = reglevel++;
-    var->vd.pidx = registerlocalvar(ls, fs, var->vd.name);
+    var->vd.ridx = (lu_byte)reglevel++;
+    var->vd.pidx = (short)registerlocalvar(ls, fs, var->vd.name);
@@ -500 +500 @@
-    fs->freereg += needed;  /* remove extra values */
+    fs->freereg = (lu_byte)(fs->freereg + needed);  /* remove extra values */
@@ -683 +683 @@
-  fs->freereg = stklevel;  /* free registers */
+  fs->freereg = (lu_byte)stklevel;  /* free registers */
@@ -864 +864 @@
-  fs->freereg = reg;  /* free registers */
+  fs->freereg = (lu_byte)reg;  /* free registers */
@@ -1067,2 +1067,3 @@
-  fs->freereg = base+1;  /* call removes function and arguments and leaves
-                            one result (unless changed later) */
+  fs->freereg = (lu_byte)(base+1);  /* call removes function and
+                                       arguments and leaves one result
+                                       (unless changed later) */
@@ -1341 +1342 @@
-          lh->v.u.ind.t = extra;  /* assignment will use safe copy */
+          lh->v.u.ind.t = (lu_byte)extra;  /* assignment will use safe copy */
@@ -1347 +1348 @@
-          lh->v.u.ind.t = extra;  /* assignment will use safe copy */
+          lh->v.u.ind.t = (lu_byte)extra;  /* assignment will use safe copy */
@@ -1353 +1354 @@
-          lh->v.u.ind.idx = extra;  /* previous assignment will use safe copy */
+          lh->v.u.ind.idx = (short)extra;  /* previous assignment will use safe copy */
@@ -1738 +1739 @@
-    getlocalvardesc(fs, vidx)->vd.kind = kind;
+    getlocalvardesc(fs, vidx)->vd.kind = (lu_byte)kind;
@@ -1913 +1914 @@
-  ls->fs->freereg = luaY_nvarstack(ls->fs);  /* free registers */
+  ls->fs->freereg = (lu_byte)luaY_nvarstack(ls->fs);  /* free registers */
diff -U 0 lua-5.4.7/src/lstring.c lua-5.4.7-c4244/src/lstring.c
--- lua-5.4.7/src/lstring.c 2024-06-13 18:15:09.000000000 -0400
+++ lua-5.4.7-c4244/src/lstring.c 2024-07-02 17:51:40.902147300 -0400
@@ -268 +268 @@
-  u->nuvalue = nuvalue;
+  u->nuvalue = (unsigned short)nuvalue;
diff -U 0 lua-5.4.7/src/lstrlib.c lua-5.4.7-c4244/src/lstrlib.c
--- lua-5.4.7/src/lstrlib.c 2024-06-13 18:15:09.000000000 -0400
+++ lua-5.4.7-c4244/src/lstrlib.c 2024-07-02 17:51:41.657295200 -0400
@@ -131 +131 @@
-    p[i] = tolower(uchar(s[i]));
+    p[i] = (char)tolower(uchar(s[i]));
@@ -144 +144 @@
-    p[i] = toupper(uchar(s[i]));
+    p[i] = (char)toupper(uchar(s[i]));
@@ -541 +541 @@
-  ms->level = level+1;
+  ms->level = (unsigned char)(level+1);
@@ -1019 +1019 @@
-  buff[n] = (d < 10 ? d + '0' : d - 10 + 'a');  /* add to buffer */
+  buff[n] = (char)(d < 10 ? d + '0' : d - 10 + 'a');  /* add to buffer */
@@ -1062 +1062 @@
-      buff[i] = toupper(uchar(buff[i]));
+      buff[i] = (char)toupper(uchar(buff[i]));
<<<<

I can also provide this patch as a file attachment if Google Groups screws up the formatting somehow.

Thanks,
Joev Dubach

bil til

unread,
Jul 3, 2024, 10:38:16 AM7/3/24
to lu...@googlegroups.com
thank you, this is interesting proposal I think ... just I am not so
sure whether it is wise to use (char) for such "precision concerned"
applications.

At least in smaller systems with possibly restricted RAM (mainy
microcontrollers I think) it is common that the Compiler supports an
"easy selection possibility" whether chars should be signed or
unsigned (e. g. one of 10-20 "main" selection checkboxes for Keil ARM
CC).

Unsigned chars are typically the choice if your software uses much
string processing, here often programmers want bytes in unsigned form.

Signed chars are nice, if you typedef BOOL type to chars - if char
then is signed, you can also support negative error codes to
"accompany BOOL variables".

So better anywhere in your code instead of (char) use (signed char),
or create a typedef (schar)
> --
> You received this message because you are subscribed to the Google Groups "lua-l" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to lua-l+un...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/lua-l/7a6beac7-b15c-492b-9bd6-a8c6d2ac8241n%40googlegroups.com.

Joev Dubach

unread,
Jul 3, 2024, 1:59:47 PM7/3/24
to lu...@googlegroups.com
These casts are all to the type being stored, which is the safe and clear thing to do to make the implicit cast explicit. E.g. if the effective existing code in the Lua implementation C code was:
int x = 257;
char y = x;
…the new code is:
int x = 257;
char y = (char)x;

If the existing code had been:
int x = 257;
signed char y = x;
…the new code would have been:
int x = 257;
signed char y = (signed char)x;

Joev

You received this message because you are subscribed to a topic in the Google Groups "lua-l" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lua-l/DX2g1mXpCmc/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lua-l+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lua-l/CAOnDXmGjGxjE%2ByQkx-%3DJgtU-agFXR8TSNwY_j%2BWJV2hOb8bkAw%40mail.gmail.com.

Philippe Verdy

unread,
Jul 3, 2024, 2:13:33 PM7/3/24
to lu...@googlegroups.com
This is not a "fix", but a deliberate attempt to hide the loss of precision. A real fix would use target variables with the same size/range as the source. In general it's a bad idea to put forced typecasts everywhere, unelss you know really that this is valid and a value change CANNOT occur, or is explicitly desired and commented (e.g. with some bit-shifting operations). If you are not sure of these conditions (and there's no prior assumption checks performed in the code), it's best to leave these warnings as is, until someone review the validation of source values. The C/C++ compilers do not emit these warnings for bad reasons !

Joev Dubach

unread,
Jul 4, 2024, 8:13:08 AM7/4/24
to lu...@googlegroups.com
It doesn’t hide the loss of precision, it makes it explicit and visible to the reader. In software development, you sometimes have to make one fix (make the existing problem obvious) before later making another fix (use different source/target types to fix the problem).

Of course, if someone wanted to contribute code changes to make both fixes at once, I wouldn’t object.

Joev

Philippe Verdy

unread,
Jul 4, 2024, 8:47:02 AM7/4/24
to lu...@googlegroups.com
Explicit typecast is not explicit about the intent. It's OK only if there are assertion check at least in a debug version or explicit value range tests.

If you're not sure, it's best to include an assert and see what happens at run time with a debug version.

For me those warnings emitted by a compiler are more explicit if there's nothing else in the code and not even any comment

bil til

unread,
Jul 4, 2024, 9:03:57 AM7/4/24
to lu...@googlegroups.com
Usually indeed better to hide casts in macros, I would agree.

But e. g. for LOBYTE(i) / LOWORD(i) the macros
#define LOBYTE(i) (unsigned char)(i)
#define LOWORD(i) (unsigned short)(i)

are perfectly fine and very efficient programming style to my feelings... .

(doing such "bit stripping" with signed values might e. g. be somehow
strange / questionable... as the sign bit in such an int variable is
the MOST significant bit, so bit number 7 or 15 are not really
relevant / possibly will lead to strange sign inversion... therefore
typically such macros should NOT be defined also in signed version...
if you do such casting direclty in the "main code" without using
macros, you might run into such strange sign inversion problems by
some "accidential" "mis-casts" quite easily).

sur-behoffski

unread,
Jul 4, 2024, 8:03:38 PM7/4/24
to lu...@googlegroups.com
On 2024-07-04 22:33, bil til wrote:
> [...]
> But e. g. for LOBYTE(i) / LOWORD(i) the macros
> #define LOBYTE(i) (unsigned char)(i)
> #define LOWORD(i) (unsigned short)(i)
> [...]

By definition, sizeof(char) == 1. However, I've worked in C
environments on a DSP (Harvard architecture), where CHAR_BIT
is 32. In such cases, (i & 0xff) might be intended as part
of the conversion cast. [Misaligned accesses cause an
exception.]

The original "char is int" comes from in-band signalling,
where, in addition to all values 0-255, a function may return
a negative value (e.g. -1) plus error description in errno.
You need to convince the compiler (and further readers of the
code) that this in-band signalling has been stripped out.

To make things worse, a signed char may be interpreted as
-128..127 (0x80..0x7f), so a change from signed to unsigned
might be needed before a length extension, e.g. 8bit-to-32bit.

Moving out a bit further, a 32-bit char is quite different
where Unicode is involved. Values 0x00 to 0x7f are ASCII;
the number of octets to represent even this range this varies
(e.g. one octet in UTF-8, versus two octets, together with
endianness issues, in UTF-16). See e.g. Wikipedia's
"Comparison of Unicode encodings" for more gotchas.

In general, be careful when dealing with the C89 "char" versus
some higher-level "char" abstraction that the program may be
dealing with.

cheers, s-b etc
Reply all
Reply to author
Forward
0 new messages