On 2015-02-16, at 2:38 AM, tdh...@gmail.com wrote:
I propose a method for including an external file in a .cpp file, but instead of copy/pasting code as #include does, it will create a char array containing the file contents.
Apologies if this has been discussed before. This is something I've wanted in C++ for ages and it seems trivial
# Overview
I propose a method for including an external file in a .cpp file, but instead of copy/pasting code as #include does, it will create a char array containing the file contents.
# Motivation
People have wanted something like this for decades ...
The lack of a standard, built-in, cross-platform way to include these has led to hacks like the XPM image format, and recently this method using GCC assembly:
#load "shader.vert" main_shader main_shader_len... It's not perfect and all-encompassing, but it is very simple and way better than anything we have now.
I find this useless, especially in embedded environments, since the binary data needs some processing anyway: either before building the application (in which case we are back to the original "problem", since the build system has to invoke an external tool), or later, once the application is running, which costs time and stores the result in RAM.
If I'm mistaken, please show use cases where a binary file is embedded exactly "as is", without requiring any additional processing.
Maybe the key question is: what would be the sources of those files to be embedded?
Daniel.
--
Hi,
Thanks everyone for the feedback! I'm going to summarise it here and give responses.
* Use std::array or iostream.
Nice idea. The only reason not to do this is that it makes things more complicated, and I thought this proposal might be useful to the C folk too. std::array is elegant though, and good point about memory residence.
* No need for a size term - just use sizeof() on the array.
That is a nice idea in theory, but arrays decay into pointers at the drop of a hat, and then you lose your size information, e.g. when passed to functions.
* What's wrong with incbin, using external tools etc?
To be clear, this doesn't let you do anything new, it just makes it easier. There are already tons of things in C++ that are just conveniences so that isn't a real objection.
* Use a proper resource system!
That's overkill for many things, and is often not available.
* Don't see a need for this.
I've given clear examples of where there is definitely a need. The existence of incbin proves it. Incbin has 77 stars and 5 forks in less than two weeks. As Matthew Woehlke mentioned it is easy to find code that would benefit from this.
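On the sizeof() point in the summary above, a minimal sketch of how the size survives or is lost. The names `shader_src`, `size_by_reference` and `size_by_pointer` are made up for illustration, and the string literal merely stands in for file contents:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an embedded file: 14 characters plus the null.
constexpr char shader_src[] = "void main() {}";

// Takes the array by reference, so the element count N is deduced and kept.
template <std::size_t N>
constexpr std::size_t size_by_reference(const char (&)[N]) { return N; }

// Takes a pointer: the array argument decays and only the pointer size is left.
inline std::size_t size_by_pointer(const char* p) { return sizeof(p); }

static_assert(sizeof(shader_src) == 15, "sizeof works while the array type is visible");
static_assert(size_by_reference(shader_src) == 15, "a reference parameter keeps the size");
```

Calling `size_by_pointer(shader_src)` yields `sizeof(const char*)`, which is the loss of information the reply is worried about; a separate length symbol or a reference-to-array parameter both avoid it.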
On 2015-02-17, at 8:46 AM, Thiago Macieira <thi...@macieira.org> wrote:
a) why would you store the array into anything except a static const array or
a template type that is equivalent to that? If the template type takes an
array by reference, it doesn't matter if the string was "foo" or
{'f', 'o', 'o', 0}
b) why would the underlying element type be anything other than char?
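Point (a) can be checked directly: a string literal and the equivalent brace list initialise arrays of the same type with the same bytes. A small sketch, with illustrative names (`from_literal`, `from_list`, `payload_length`, `same_bytes`) that are not from the thread:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

const char from_literal[] = "foo";
const char from_list[]    = {'f', 'o', 'o', 0};

// Both declarations produce const char[4].
static_assert(std::is_same<decltype(from_literal), decltype(from_list)>::value,
              "identical array types");

// A template taking the array by reference cannot tell the two apart.
template <std::size_t N>
constexpr std::size_t payload_length(const char (&)[N]) { return N - 1; }

// Compares the stored bytes, terminator included.
inline bool same_bytes() {
    return std::memcmp(from_literal, from_list, sizeof(from_literal)) == 0;
}
```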
Okay, I'm confused... above you say "it is a tremendously bad idea to
bake any resource into a binary", but here you are suggesting exactly that?
Who said anything about semantic checking?
*snip* On the other hand, it would be extremely convenient for very simple applications where quality of code is a lesser concern. More importantly however I'm confident you could build on such a mechanism to provide an effective resource system *without the need for additional tools* that would not have problems here, e.g. having a source file that is *just* resources and some minimal code (maybe just exporting the symbols) to make them available to other TU's
Does it make a difference if it's a separate file or the same as the executable
if the only way to get anything changed on the device is to flash it with a new
image?
Please don't generalise: what might be a bad idea for some scenarios may be
perfectly acceptable for others.
*snip*
Can anyone think of a better syntax?
Cheers,
Tim
[[from_file("shader.vert")]] // Perhaps use the same lookup rules as #include
extern const char src[]; // Size is computed when sizeof() is applied: "shader.vert" is loaded and measured. Linker deals with the data later...
On 2015-02-17, at 12:43 PM, Thiago Macieira <thi...@macieira.org> wrote:
The problem is that the file containing binary data may also be intended to be
portable, so it may contain data in a specific endianness.
When compiled to a
target with a different endianness, the compiler could be expected to swap
things around.
Which is why I am saying this feature, if adopted, should be restricted to
1-byte entities. That also resolves the question of whether to return a
character literal or an initializer list: since they are the same, a character
literal is easier to understand.
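If the embedded data is restricted to 1-byte entities as argued above, any multi-byte values in the file have to be assembled explicitly by the program, which sidesteps the host/target endianness question. A sketch under that assumption; `read_u32_le`, `read_u32_be` and `blob` are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

// Reads a 32-bit value stored least-significant byte first.
constexpr std::uint32_t read_u32_le(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | static_cast<std::uint32_t>(p[1]) << 8
         | static_cast<std::uint32_t>(p[2]) << 16
         | static_cast<std::uint32_t>(p[3]) << 24;
}

// Reads a 32-bit value stored most-significant byte first.
constexpr std::uint32_t read_u32_be(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) << 24
         | static_cast<std::uint32_t>(p[1]) << 16
         | static_cast<std::uint32_t>(p[2]) << 8
         | static_cast<std::uint32_t>(p[3]);
}

// Stand-in for four bytes of an embedded file.
constexpr unsigned char blob[] = {0x78, 0x56, 0x34, 0x12};
```

Both readers give the same answer on any host, because the byte order is stated in the code rather than inferred by the compiler.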
On 2015-02-17, at 12:45 PM, Thiago Macieira <thi...@macieira.org> wrote:
Does it make a difference if it's a separate file or the same as the executable
if the only way to get anything changed on the device is to flash it with a new
image?
How about attributes?
[[from_file("shader.vert")]] // Perhaps use the same lookup rules as #include
extern const char src[]; // Size is computed when sizeof() is applied: "shader.vert" is loaded and measured. Linker deals with the data later...
Yep. So they can also ignore [[dllimport]] (if MS ever does this...) and other things that would otherwise prevent compilation.
Other attributes that break in terrible ways when ignored include all the "support" ever implemented for parsing __declspec() (which includes __declspec(align) and __declspec(property)).
A toolchain that ignores [[from_file("")]] would simply fail to compile it (can't compute array size, can't link).
It follows that source code expecting this feature would fail to compile, as expected.
*snip*
My main suggestion is that C++ should provide a function that returns a named resource as a std::istream. Then the programmer chooses between text and platform-specific endianness, and the platform build system chooses between storage in the executable, in a separate file, in a file aggregating various resources, etc.
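To make the shape of that suggestion concrete, here is a rough sketch. Nothing in it is a real or proposed standard API: `open_resource` and `resource_table` are hypothetical, and the in-memory map only stands in for whatever storage the build system would actually pick.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical backing store; a real implementation might read from the
// executable image, a side file, or an aggregated resource archive.
inline const std::map<std::string, std::string>& resource_table() {
    static const std::map<std::string, std::string> table = {
        {"shader.vert", "void main() {}\n"},
    };
    return table;
}

// Hypothetical API: a stream over the named resource, or nullptr if unknown.
inline std::unique_ptr<std::istream> open_resource(const std::string& name) {
    const auto it = resource_table().find(name);
    if (it == resource_table().end()) return nullptr;
    return std::make_unique<std::istringstream>(it->second,
                                                std::ios::in | std::ios::binary);
}
```

The caller only sees a `std::istream`, so where the bytes live stays an implementation decision, which seems to be the point of the suggestion.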
*snip*
On Monday 16 February 2015 22:55:16 Chris Gary wrote:
*snip*
Those are not C++ standard attributes. As extensions, they are out of scope.
All current C++ standard attributes are ignorable if the compiler does not
implement that feature. That's why alignas is a keyword, not an attribute.
GLint (STDCALL_OR_NIL
*glDoSomething)(...);
#ifdef _WIN32
#define STDCALL_OR_NIL [[msvc::stdcall]] // whatever this might actually look like
#else
#define STDCALL_OR_NIL
#endif
STDCALL_OR_NIL
GLint (*glDoSomething)(...);
On 2015-02-17, at 3:08 PM, Thiago Macieira <thi...@macieira.org> wrote:
Your suggestion is completely orthogonal to the suggestion from the OP. It
would be a nice feature to have, but very difficult to implement source files
without #load or without a helper code generator (like Qt's rcc).
Which is why I am saying this feature, if adopted, should be restricted to
1-byte entities. That also resolves the question of whether to return a
character literal or an initializer list: since they are the same, a
character literal is easier to understand.
I’m not sure what you mean by “return a character literal.”
I was talking about #load. The product of that should be a very long character
literal, which you can store in an array or manipulate via templates or
constexpr functions.
> Mainly to keep calling convention specifiers and other paraphernalia in a predictable place.
Nope. Although it’s often advisable for vendors to group proprietary keywords together with attributes where possible, they’re not allowed to turn keywords into attributes where they convey necessary meaning.
> A minor change, but a definite improvement in usability: Just put all the random compiler-specific goodies in front of a declaration.
Vendors don’t need the standard to tell them where to add extensions.
[[glsl::fragment_source]]
const char *frag_src = R"frag(
main()
{
gl_FragColor = gl_Color;
}
)frag";
[[from_file("shader.vert")]]
extern const char str[];
const char *src =
#load "things.stuff" // This HAS to be on its own line
;
#load_into src "stuff.things" // Coupling with the next stage: "src" is a pp-token here
const char src[]; // what does my decl actually look like?
#pragma paste_file(src, "stuff.things")
const char src[]; // I still don't know what my decl should look like!
On 2015-02-17, at 4:38 PM, Chris Gary <cgar...@gmail.com> wrote:
Can you clarify that last bit? N2761 seems to suggest them as a replacement for all forms of __attribute__ and __declspec (much ado about how they are equivalent to GCC's __attribute__).
Even mentioning alignment in the "yes, do this" bullet list of suggestions.
Well, now they have been given very strong advice indeed! Not that they'll listen…
I'm probably just dreaming with that example, though. The calling convention should actually go right next to the pointer, now that I think about it.
Avoiding further digression: I already agree with your suggestion that an opaque named-resource-stream would be the most feasible approach to this problem.
#pragma paste_file(src, "stuff.things")
const char src[]; // I still don't know what my decl should look like!
On 2015–02–17, at 5:43 PM, David Krauss <pot...@gmail.com> wrote:
My suggestion almost avoids touching the core language, but I said the standard function should take a compile-time string.
On Wednesday 18 February 2015 00:17:15 Magnus Fromreide wrote:
> char greeting[] = {
> #load "strings/hello.txt"
> }
>
> otherGreeting<
> #load "strings/hello.txt"
>
> > bar;
>
> thirdGreeting(
> #load "strings/hello.txt"
> );
>
> and finally there is
You can replace those with a load into a constexpr char variable and pass that
to the template or function call.
We're not talking about loading a file and interpreting it as C++ source code.
We have #include for that already.
CREATE TABLE my_table( cod int(10), nm varchar(30) )
WEIRED_PARSE_MACRO(#load "..\..\..\var\mysql_data\my_table.frm");
struct my_table{
int cod;
std::string nm;
};
What we *are* missing is a *portable* way to go from binary data on disk
to binary data in code. xxd *is not* that way. IMO this is a
sufficiently common problem that esoteric tools should not be required
to solve it. It should be built into the compiler. (Also IMO, process
invocation should *absolutely not* be built into the compiler.)
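For context, the external-tool route being criticised boils down to a build step that turns file bytes into array source text, in the spirit of `xxd -i`. A minimal sketch of such a generator; `to_c_array` is a made-up name, its output format is illustrative rather than a copy of any tool's, and an empty input would produce an invalid zero-length array:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Emits source text for a byte array and its length.
inline std::string to_c_array(const std::string& symbol, const std::string& bytes) {
    std::ostringstream out;
    out << "const unsigned char " << symbol << "[] = {";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0) out << ", ";
        // Go through unsigned char so bytes >= 0x80 do not print as negatives.
        out << static_cast<unsigned>(static_cast<unsigned char>(bytes[i]));
    }
    out << "};\n";
    out << "const unsigned int " << symbol << "_len = " << bytes.size() << ";\n";
    return out.str();
}
```

The generator itself is trivial; the complaint in the thread is about having to wire an extra tool into every build system.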
#define STR(X) #X
const char data[] = STR(
#include "file.txt"
);
// frag.glsl
#version 330
uniform vec4 outColor;
void main() { outColor = vec4(1, 0, 0, 1); }
// C++ code
#define STR(X) #X
const char data[] = STR(
#include "frag.glsl"
);
On Wednesday 18 February 2015 11:02:41 Dale Weiler wrote:
> I don't care what the syntax is, but I can assure you that the language
> shouldn't interpret the data as anything but a raw sequence of bytes. There
> should be no endianness conversion,
What should the compiler do if the input narrow charset is not the same as the
execution narrow charset? Worse yet, what if the size of the byte is different?
(cross-compiling to a platform where CHAR_BIT is different from the host
platform where the file is stored)
const auto src = static_text<char>("shader.vert");
const auto wtxt = static_text<wchar_t>("dialogue.msg");
const auto datBlob = static_data<uint8_t>("blob.dat");
const auto datBigBlob = static_data<uint32_t>("big_blob.dat");
// char-flavored dump of "shader.vert", intervening nulls included, stops at first EOF
const char src[] = {static_sequence<char>("shader.vert")};
// Just 0x00, 0x01, 0x02, 0x03, etc... Pass it to a constructor, too!
const uint8_t blob[] = {static_sequence<uint8_t>("blob.dat")};
// 'w','o','r','d','s',' ' ... Including nulls (if any).
const char str_a[] = {static_sequence<char>("words.txt")};
// Stops at first null, or at the first EOF then adds a null instead.
// "words words words" <- null terminated!
const char *str = static_string<char>("words.txt");
template<char ...chars_>
struct charbag{};
fancy_template<
some_type,
charbag<static_sequence<char>("words.txt")...>
> boom{};
//Performs \r\n vs \n conversion
const char a[] =
#include_text "some_file.txt"
;
//Import the raw bytes as a literal
const uint8_t b[] =
#include_bin "some_file.bin"
;
> This kind of feature would be really nice to have. While some people may
> argue its better not to embed data in the binary, sometimes it really just
> makes sense to do it that way. Use cases that come to mind include 3d
> applications with GPU shaders and embedded applications with firmware and
> other binary data blobs. This feature has been available with assemblers
> since ancient history, we should have it in C and C++ as well.
I think this is a problem for the build system: it should prepare such in-code data
blobs using some tool (Qt's qmake and qrc are a prominent example of
this). Anyway, you would have to compile your shader code or firmware
using an external compiler, so you still need an advanced build system of
some kind in order to get a fully automatic build.
> I think such a feature would make the most sense being implemented using
> the preprocessor. Then the implementation can just leverage the cpp include
> path to search for the files.
Of course reusing an existing concept is the easiest option, but
putting binary blobs into the include path doesn't sound sane.
Apart from encoding issues, splitting a statement like that looks a bit ugly.
const uint8_t b[] = __INCLUDE_FILE_CONTENTS("some_file.bin");
The above is a bit nicer, but still looks like an ad-hoc solution for
a not so common problem.
To sum up, I think that plain binary inclusion is the only option
which should be considered if this proposal is ever going to be accepted.
On 2015-02-19 17:42, Matthew Fioravante wrote:
> Does anyone have a good reason why this feature does not belong in the
> preprocessor?
You more or less said it; performance. (Also, potentially, ease of
implementation.) Depending on the compiler, it may be much easier to
simply copy the file contents directly from the input file to the
process that's writing the output object file.
As previously stated though I'd be inclined to not legislate this, but
rather specify that the feature behaves "as if" done by the preprocessor
and leave it to the compiler vendors whether or not that's how they
*actually* want to implement it.
> One major issue I can think of regarding differentiating text from binary
> data is null termination. When we import a text blob, we probably
> would like to have a null terminator added at the end so that the text can
> be used with legacy C api's. GPU shaders in OpenGL would require this.
Hmm... good point. I was sort of assuming the presence of a null
terminator, but you're right that for "pure" binary data (say, image
files) this could be undesirable.
That being the case, this may be a
good way to handle line endings also.
template <typename CharT, size_t N>
constexpr auto remove(string_literal<N, CharT> lit, CharT c) {
auto nc = count(lit.begin(), lit.end(), c);
string_literal<N-nc, CharT> ret;
copy_if(lit.begin(), lit.end(), ret.begin(), [c](auto x) { return x != c; });
return ret;
}
auto norm_text = remove(__INCLUDE_TEXT(char, "some_file.txt"), '\r');
IOW, one 'mode' does line ending translation and null terminates, the
other does neither.
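The two modes can be illustrated at run time (the thread wants them at compile time, which this sketch does not attempt). `as_binary` and `as_text` are hypothetical names, and `std::string` is used so the terminator comes for free via `c_str()`:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// "Binary" mode: bytes pass through untouched, nothing is appended.
inline std::string as_binary(const std::string& raw) { return raw; }

// "Text" mode: every "\r\n" pair becomes "\n"; a lone '\r' is left alone.
inline std::string as_text(const std::string& raw) {
    std::string out;
    out.reserve(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i) {
        const bool crlf = raw[i] == '\r' && i + 1 < raw.size() && raw[i + 1] == '\n';
        if (crlf) continue;  // drop the '\r'; the '\n' is copied on the next pass
        out.push_back(raw[i]);
    }
    return out;
}
```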
constexpr auto data = __INCLUDE_BIN("some_file.bin");
constexpr auto text = __INCLUDE_TEXT(char,"some_file.txt");
//decltype(text) == const char [/*sizeof file + sizeof(char)*/]
//decltype(data) == const char [/*sizeof file*/] //or maybe unsigned char
constexpr auto data = __INCLUDE_BIN(uint8_t,"some_file.bin");
//file.txt contains the text "file"
__INCLUDE_TEXT("file.txt");
"file"; //<-Produces a string literal
constexpr auto document = "Common Header\n"
__INCLUDE_TXT("section1.txt") "\n"
__INCLUDE_TXT("section2.txt") "\n"
"Common Footer\n";
auto x = "\x00\x0A"b "\x0B\x01"b;
//decltype(x) == char[4];
//x == { 0, 0x0A, 0x0B, 1 };
__INCLUDE_FILE(X"file.txt"Y); //file.txt contains the text "file"
X"\x66\x69\x6C\x65"Y; //Prefix and suffix applied to the resulting string literal
auto fw = __INCLUDE_FILE("firmware.bin"b);
auto fw = "\x01\x00..."b; //<-Macro expands to this
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
const char *str = load_static_string<char>("filename.txt");
const int ints[] = {load_static_sequence<int>("ints.dat")...};
On 2015-02-19 21:54, Matthew Fioravante wrote:
> On Thursday, February 19, 2015 at 7:04:53 PM UTC-5, Matthew Woehlke wrote:
>> On 2015-02-19 17:42, Matthew Fioravante wrote:
>>> One major issue I can think of regarding differentiating text
>>> from binary data is null termination. When we import a text
>>> blob, we probably would like to have a null terminator added at
>>> the end so that the text can be used with legacy C api's. GPU
>>> shaders in OpenGL would require this.
>>
>> Hmm... good point. I was sort of assuming the presence of a null
>> terminator, but you're right that for "pure" binary data (say, image
>> files) this could be undesirable.
>
> One problem with null termination is that we need to be told the intended
> character type of the data (char, char16_t, char32_t, wchar_t) in order to
> know how many bytes to reserve for the null terminator and the type of the
> resulting string literal expression generated by the macro.
Doesn't this only apply to string literals? If 'pasting' as a list of
char literals ("'h','e','l','l','o',0"), the terminator is just 0, and
the compiler will expand that to fill the element, same as every other
element of the array.
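That behaviour can be checked with today's language: the same token list initialises arrays of different element types, and the trailing 0 takes on the element width. The names `narrow`, `wide` and `count_of` are illustrative only:

```cpp
#include <cassert>
#include <cstddef>

// The same list of tokens, stored with two different element types.
const char     narrow[] = {'h', 'i', 0};
const char32_t wide[]   = {'h', 'i', 0};

// Element count of any built-in array.
template <typename T, std::size_t N>
constexpr std::size_t count_of(const T (&)[N]) { return N; }

static_assert(sizeof(narrow) == 3, "three 1-byte elements");
static_assert(sizeof(wide) == 3 * sizeof(char32_t), "terminator is a full char32_t");
```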
For string literals, it's probably better for the width prefix to be
part of the expansion... although considering this opens the door to
endian conversion questions, my inclination is to only support char
const* literals, at least as a first pass.
>> That being the case, this may be a
>> good way to handle line endings also.
>
> What if the user wants null termination but doesn't want line ending
> processing?
>
> Maybe line endings, endian swapping, [...]
That'd be fine with me. I could also live without line ending
conversion, just saying that if you want to include text data, the
resource file must already have UNIX line endings. (It helps that I hate
Windows :-) and have strictly limited sympathy for its assorted
obnoxious idiosyncrasies.)
> The text routine needs type information to correctly allocate the null.
>
> constexpr auto data = __INCLUDE_BIN("some_file.bin");
...gives a decltype({0xff,0}), i.e. *a std::initializer_list*.
Now... at this point I'm strongly inclined to this, instead:
// std::initializer_list, not terminated
constexpr auto list = {__INCLUDE_LIST("some_file.bin")};
// char[], not terminated
constexpr char data[] = {__INCLUDE_LIST("some_file.bin")};
// char[], terminated :-)
constexpr char str[] = {__INCLUDE_LIST("some_file.bin"), 0};
(This brings up an interesting point; should we state that
__INCLUDE_LIST of an empty file followed by a ',' will remove the ',' a
la MSVC's variadic macros? It seems desirable... or we could just not
support this case where the file is empty.)
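The token-list idea can be tried out today by letting an ordinary macro play the role of the expansion. `FAKE_INCLUDE_LIST` below is a hand-written stand-in for what a hypothetical `__INCLUDE_LIST("some_file.bin")` might expand to for a three-byte file; the other names are illustrative:

```cpp
#include <cassert>
#include <vector>

// Stand-in for the expansion of a hypothetical __INCLUDE_LIST on a 3-byte file.
#define FAKE_INCLUDE_LIST 0x01, 0x02, 0x03

// Array without terminator: the left-hand side decides the element type.
const unsigned char blob_data[] = {FAKE_INCLUDE_LIST};

// Array with terminator: the caller appends the 0 explicitly.
const char blob_str[] = {FAKE_INCLUDE_LIST, 0};

// The same list fed straight to a class constructor.
inline std::vector<int> as_vector() { return {FAKE_INCLUDE_LIST}; }
```

With an empty stand-in, `{FAKE_INCLUDE_LIST, 0}` would become `{, 0}` and fail to compile, which is the comma question raised above.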
> constexpr auto text = __INCLUDE_TEXT(char,"some_file.txt");
...gives a decltype("string literal").
> The binary routine could also use type information which can be useful if
> you want signed or unsigned bytes:
>
> constexpr auto data = __INCLUDE_BIN(uint8_t,"some_file.bin");
No; it should give an initializer_list (or better, token list, as
explained above); the LHS type determines the concrete type. Yes, this
means you can't assign it to 'auto' (unless you *want* the initializer
list), but you *do* want the initializer list to be able to pass it
directly to class constructors.
> __INCLUDE_TEXT should expand to a string literal.
>
> The reason for this is that now we can paste string literals together,
> which comes for free from the behavior of string literals.
Ooh, good point! Bonus! :-)
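The "bonus" relies only on ordinary adjacent-literal concatenation, so it can be demonstrated with plain macros standing in for the hypothetical `__INCLUDE_TEXT` expansions (`FAKE_SECTION1` and `FAKE_SECTION2` are made up, as are their contents):

```cpp
#include <cassert>
#include <cstring>

// Stand-ins for what __INCLUDE_TEXT("section1.txt") etc. might expand to.
#define FAKE_SECTION1 "First section"
#define FAKE_SECTION2 "Second section"

// Adjacent string literals are merged by the compiler into one literal.
const char document[] = "Common Header\n"
                        FAKE_SECTION1 "\n"
                        FAKE_SECTION2 "\n"
                        "Common Footer\n";
```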
> The only compatible way to implement __INCLUDE_BIN() is to replace the
> macro with an array initialization.
>
> This would mean you can't paste together multiple __INCLUDE_BIN()
> expressions like you can with __INCLUDE_TEXT().
This would be another reason to have __INCLUDE_LIST instead.
--
Matthew
__INCLUDE_FILE("some_file.bin");
"\x01\x02\x03\x04...."; //<- expands to this, which is null terminated
__INCLUDE_FILE(u"some_file.bin");
u"\x0102\x0304...."; //<- expands to this, which is null terminated
__INCLUDE_FILE(U"some_file.bin");
U"\x01020304...."; //<- expands to this, which is null terminated
__INCLUDE_FILE(L"some_file.bin");
L"\x0102\x0304...."; //<- expands to this (example assuming a 16-bit wchar_t), which is null terminated
template <typename T, size_t N>
constexpr std::array<T,N-1> binary(const char(&lit)[N]) { return { lit.begin(), lit.end()-1 }; }
auto fw = binary(__INCLUDE_FILE("firmware.bin"));
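As posted, the `binary` helper is shorthand rather than compilable code: built-in arrays have no `.begin()` member and `T` cannot be deduced from the argument. One way it could be spelled out, assuming C++17 if it is to be used in constant expressions, a default of `unsigned char` for `T`, and a string literal (`FAKE_INCLUDE_FILE`, hypothetical) standing in for the proposed `__INCLUDE_FILE`:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Copies a string literal into a std::array, dropping the trailing null.
template <typename T = unsigned char, std::size_t N>
constexpr std::array<T, N - 1> binary(const char (&lit)[N]) {
    std::array<T, N - 1> out{};
    for (std::size_t i = 0; i + 1 < N; ++i) {
        out[i] = static_cast<T>(lit[i]);
    }
    return out;
}

// Stand-in for the expansion of __INCLUDE_FILE on a 3-byte file;
// note the embedded zero byte, which a string literal carries without trouble.
#define FAKE_INCLUDE_FILE "\x01\x00\x7f"
```

Because the length comes from the array type and not from `strlen`, embedded zero bytes in the file are preserved.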
>> (This brings up an interesting point; should we state that
>> __INCLUDE_LIST of an empty file followed by a ',' will remove the ',' a
>> la MSVC's variadic macros? It seems desirable... or we could just not
>> support this case where the file is empty.)
>
> I would treat an empty file like an empty string literal, which means it
> becomes a no-op. So if we used your approach that means getting rid of the
> comma.
Yes, for string literals, it's trivial :-). I would also prefer to
implicitly drop the comma, but I can imagine some people finding that
objectionable.
I think it would be okay if this just produces an error
if the input file is empty; how often is that going to happen, anyway?
(It would have to be a case where you don't know beforehand that the
file will be empty... otherwise why are you loading it?)
> For binary files, we could just use the library again. This would further
> simplify the include macro.
>
> __INCLUDE_FILE("some_file.bin");
> "\x01\x02\x03\x04...."; //<- expands to this, which is null terminated
> __INCLUDE_FILE(u"some_file.bin");
> u"\x0102\x0304...."; //<- expands to this, which is null terminated
> __INCLUDE_FILE(U"some_file.bin");
> U"\x01020304...."; //<- expands to this, which is null terminated
> __INCLUDE_FILE(L"some_file.bin");
> L"\x0102\x0304...."; //<- expands to this (example assuming a 16-bit
> wchar_t), which is null terminated
Above point about getting the simple case right first, what bothers me
about that syntax is that it looks like the file name itself is being
given as a wide string. I would strongly prefer that it be a separate
argument. (In which case I would have some preference for using a type
name rather than a suffix, though I would be okay with either.)
__INCLUDE_FILE("some_file.bin", char32_t);
__INCLUDE_FILE(U, "some_file.bin");
--
Matthew
On 2015-02-18, at 5:54 AM, Thiago Macieira <thi...@macieira.org> wrote:
QResource also supports reading a resource directory directly from a file,
instead of something registered inside the binary image (see
QResource::registerResource).
But David's description is more similar to QFileSelector.
On 2015-02-21, at 11:50 AM, David Krauss <pot...@gmail.com> wrote:
Is it somehow better to put Herculean effort into achieving these tasks with C++ metaprogramming, instead of using ordinary tools with ordinary toolchains that exist today?