x/image/font: Font serialization

Sebastien Binet

Feb 4, 2021, 1:02:29 PM
to golan...@googlegroups.com
hi there,

Right now, I am pretty happy with the state of the
x/image/font{,/sfnt,/opentype} packages. I can load TTF/OTF files, draw
some glyphs in a way that (almost) resembles LaTeX[1].

Great. (and many thanks, by the way.)

We migrated gonum/plot[2] from freetype to x/image/font recently and all
our use cases worked well (as far as I know.)

Well, all, save for one: being able to embed fonts into PDFs.

To some extent, embedding fonts into PDF files needs to re-serialize
sfnt.Font back into a []byte following the OTF format.

Unless I am mistaken, x/image/font doesn't seem to provide the reverse
function of sfnt.Parse.
(This is a bit annoying because one needs to provide a way to associate
a given font.Face or sfnt.Font with its original []byte raw data.)

Is it something that x/image/font should or would provide?

[1]: https://github.com/go-latex/latex/tree/main/cmd/mtex-render
[2]: https://gonum.org/v1/plot

Robert Engels

Feb 4, 2021, 2:28:47 PM
to Sebastien Binet, golan...@googlegroups.com
If you have the data to pass to Parse then you have the data to embed the font in the pdf.

Sebastien Binet

Feb 4, 2021, 3:00:24 PM
to Robert Engels, golan...@googlegroups.com
yes.
but as I wrote in the OP, it's not completely satisfying.
one needs to keep track of the association font.Face/[]byte.
so that's either double the memory (give or take), or a filename/io.Reader handle to keep around.

-s

Robert Engels

Feb 4, 2021, 6:47:40 PM
to Sebastien Binet, golan...@googlegroups.com
I think you want to include the original font data. When you parse the font in Go it only needs the hints/fidelity for the Go renderer. When you create the pdf you want to have the full font for optimum rendering.

Nigel Tao

Feb 4, 2021, 8:11:03 PM
to Sebastien Binet, Robert Engels, golang-nuts
On Fri, Feb 5, 2021 at 7:00 AM Sebastien Binet <s...@sbinet.org> wrote:
> but as I wrote in the OP, it's not completely satisfying.
> one needs to keep track of the association font.Face/[]byte.
> so that's either double the memory (give or take), or a filename/io.Reader handle to keep around.

Two []byte values aren't double the memory if the slices share the same backing array.
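
A minimal sketch of that point, using only the standard library. The loadedFont type and the sameBacking helper are hypothetical names made up for illustration; in real code the raw slice would be the one handed to sfnt.Parse.

```go
package main

import "fmt"

// loadedFont is a hypothetical wrapper: it keeps the raw font bytes around
// so they can be embedded later (for example into a PDF).
type loadedFont struct {
	raw []byte // the same slice that would be handed to sfnt.Parse
}

// sameBacking reports whether two non-empty slices start at the same
// element of the same backing array.
func sameBacking(a, b []byte) bool {
	return len(a) > 0 && len(b) > 0 && &a[0] == &b[0]
}

func main() {
	data := []byte("stand-in for the contents of a TTF/OTF file")

	lf := loadedFont{raw: data}           // copies the slice header only
	clone := append([]byte(nil), data...) // an explicit copy, for contrast

	fmt.Println(sameBacking(data, lf.raw)) // true: no extra copy of the bytes
	fmt.Println(sameBacking(data, clone))  // false: a separate allocation
}
```

Storing the slice next to the parsed font costs one slice header (pointer, length, capacity), not a second copy of the file.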

If you're passing a file-backed io.ReaderAt to sfnt.ParseReaderAt, you're going to have to keep the file open anyway, and having multiple references to the same io.ReaderAt similarly all share the same file descriptor. You're probably also going to have to track the underlying *os.File separately anyway, in order to Close it when you're done (or else you'd leak it).

What's your proposed API?

Sebastien Binet

Feb 5, 2021, 4:57:25 AM
to Nigel Tao, Robert Engels, golang-nuts
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, February 5th, 2021 at 2:10 AM, Nigel Tao <nige...@golang.org> wrote:

> On Fri, Feb 5, 2021 at 7:00 AM Sebastien Binet <s...@sbinet.org> wrote:
>
> > but as I wrote in the OP, it's not completely satisfying.
> > one needs to keep track of the association font.Face/[]byte.
> > so that's either double the memory (give or take), or a filename/io.Reader handle to keep around.
>
> Two []byte values aren't double the memory if the slices share the same backing array.

sure, I was thinking more of the memory taken up by the []byte and its "equivalent" as a sfnt.Font.

>
> If you're passing a file-backed io.ReaderAt to sfnt.ParseReaderAt, you're going to have to keep the file open anyway, and having multiple references to the same io.ReaderAt similarly all share the same file descriptor. You're probably also going to have to track the underlying *os.File separately anyway, in order to Close it when you're done (or else you'd leak it).
>
> What's your proposed API?

package sfnt

// Marshal returns the OTF encoding of f.
func Marshal(f Font) ([]byte, error)
func MarshalWriter(w io.Writer, f Font) error

(for lack of a better name, mirroring the Parse/ParseReaderAt functions)

-s

Nigel Tao

Feb 5, 2021, 5:19:00 PM
to Sebastien Binet, Robert Engels, golang-nuts
On Fri, Feb 5, 2021 at 8:56 PM Sebastien Binet <s...@sbinet.org> wrote:
> On Friday, February 5th, 2021 at 2:10 AM, Nigel Tao <nige...@golang.org> wrote:
> > Two []byte values aren't double the memory if the slices share the same backing array.
>
> sure, I was thinking more of the memory taken up by the []byte and its "equivalent" as a sfnt.Font.

Ah. A sfnt.Font isn't really an "equivalent". It's more like an in-memory cache or index of small but frequently-used parts of the underlying []byte (or io.ReaderAt), but it's not comprehensive. It doesn't have any in-memory representation of the not-frequently-used parts, including the actual glyph vectors. You could only marshal it to a complete TTF/OTF if you had the original bytes lying around too. But if you have that, you don't need to marshal anything.
 

> > What's your proposed API?
>
> package sfnt
>
> // Marshal returns the OTF encoding of f.
> func Marshal(f Font) ([]byte, error)
> func MarshalWriter(w io.Writer, f Font) error

I suppose we could rename "type source" to "type Source" and have:

// Source returns the []byte or io.ReaderAt passed to Parse or ParseReaderAt.
func (f *Font) Source() Source

or maybe:

// Source returns the []byte or io.ReaderAt passed to Parse or ParseReaderAt.
//
// fileLength is the largest file offset referred to by f's tables. An
// io.ReaderAt doesn't necessarily know its own 'file length'.
func (f *Font) Source() (s Source, fileLength int64)
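
A rough sketch of how such a fileLength could be derived when only an io.ReaderAt is available: walk the sfnt table directory and take the largest offset+length. This is not the package's code. The function names are hypothetical, the input is a synthetic two-table file, and font collections (.ttc) and validation are left out.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// readFull reads exactly len(p) bytes at off, tolerating an io.EOF that
// accompanies a complete read (which the io.ReaderAt contract allows).
func readFull(r io.ReaderAt, p []byte, off int64) error {
	n, err := r.ReadAt(p, off)
	if n == len(p) {
		return nil
	}
	return err
}

// fileLength returns the largest end offset (offset+length) named by the
// table directory of a single-font sfnt file read through r.
func fileLength(r io.ReaderAt) (int64, error) {
	var hdr [12]byte // sfnt version (4), numTables (2), three search fields (6)
	if err := readFull(r, hdr[:], 0); err != nil {
		return 0, err
	}
	numTables := int(binary.BigEndian.Uint16(hdr[4:6]))
	end := int64(12 + 16*numTables) // the directory itself counts too
	var rec [16]byte                // tag (4), checksum (4), offset (4), length (4)
	for i := 0; i < numTables; i++ {
		if err := readFull(r, rec[:], int64(12+16*i)); err != nil {
			return 0, err
		}
		off := int64(binary.BigEndian.Uint32(rec[8:12]))
		n := int64(binary.BigEndian.Uint32(rec[12:16]))
		if off+n > end {
			end = off + n
		}
	}
	return end, nil
}

// fakeFont builds a 63-byte stand-in: a directory with two tables,
// one at [44, 54) and one at [56, 63).
func fakeFont() []byte {
	b := make([]byte, 44)
	binary.BigEndian.PutUint32(b[0:], 0x00010000)
	binary.BigEndian.PutUint16(b[4:], 2)
	copy(b[12:], "aaaa")
	binary.BigEndian.PutUint32(b[20:], 44)
	binary.BigEndian.PutUint32(b[24:], 10)
	copy(b[28:], "bbbb")
	binary.BigEndian.PutUint32(b[36:], 56)
	binary.BigEndian.PutUint32(b[40:], 7)
	return append(b, make([]byte, 19)...)
}

// demo runs fileLength over the synthetic font.
func demo() (int64, error) {
	return fileLength(bytes.NewReader(fakeFont()))
}

func main() {
	fmt.Println(demo())
}
```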

Tangentially, using a TTF/OTF font needs random access to the underlying data, unlike e.g. decoding a JPEG using a 'one and done' sequential read. Package sfnt was designed to work with either a []byte or an io.ReaderAt, but the code paths are more complicated for io.ReaderAt. I'm curious if anyone actually uses the io.ReaderAt support or whether, in hindsight, it was unnecessary complexity. For example, on many systems it's possible to mmap a file as a []byte, instead of going through an *os.File, but I don't have a good sense if "on many systems" is "on all systems (in practice)"...

Nigel Tao

Feb 5, 2021, 6:08:19 PM
to Sebastien Binet, Robert Engels, golang-nuts
On Sat, Feb 6, 2021 at 9:18 AM Nigel Tao <nige...@golang.org> wrote:
> It doesn't have any in-memory representation of the not-frequently-used parts, including the actual glyph vectors.

Correction: it doesn't have not-frequently-used or too-big-to-cache parts, and glyph vectors are the latter. Avoiding "takes double the memory" is precisely the concern.

Robert Engels

Feb 5, 2021, 7:56:22 PM
to Nigel Tao, Sebastien Binet, golang-nuts
If you don’t write the font “exactly” you might run into copyright/TOS problems. 

In most cases though, you are better off using standard fonts and using the correct name in the pdf - and let the viewer find/replace the font - you will not have a problem in that case.

Sebastien Binet

Feb 8, 2021, 4:41:16 AM
to Nigel Tao, Robert Engels, golang-nuts
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On Friday, February 5th, 2021 at 11:18 PM, Nigel Tao <nige...@golang.org> wrote:
[...]

> > > What's your proposed API?
> >
> > package sfnt
> >
> > // Marshal returns the OTF encoding of f.
> >
> > func Marshal(f Font) ([]byte, error)
> > func MarshalWriter(w io.Writer, f Font) error
>
> I suppose we could rename "type source" to "type Source" and have:
>
> // Source returns the []byte or io.ReaderAt passed to Parse or ParseReaderAt.
> func (f *Font) Source() Source

SGTM, and make Source implement io.Reader or io.WriterTo.

or directly return an io.Reader (either over the underlying []byte, or as an io.SectionReader wrapping the io.ReaderAt plus its size)?
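
To make the second option concrete, here is a small sketch using only the standard library. The sourceReader function and its signature are hypothetical; bytes.NewReader and io.NewSectionReader are the real building blocks.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// sourceReader returns a fresh sequential reader over font data held either
// as a []byte or as an io.ReaderAt plus a known size.
func sourceReader(b []byte, r io.ReaderAt, size int64) io.Reader {
	if b != nil {
		return bytes.NewReader(b)
	}
	return io.NewSectionReader(r, 0, size)
}

// demo reads the same ten bytes through both code paths.
func demo() (fromBytes, fromReaderAt string) {
	a, _ := io.ReadAll(sourceReader([]byte("font bytes"), nil, 0))
	// The ReaderAt holds more data than the font; size limits what is read.
	ra := strings.NewReader("font bytes plus trailing data")
	b, _ := io.ReadAll(sourceReader(nil, ra, 10))
	return string(a), string(b)
}

func main() {
	a, b := demo()
	fmt.Println(a == b)
}
```

The size argument is what makes the io.ReaderAt case work, since an io.ReaderAt alone does not say where the font data ends.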

>
> or maybe:
> // Source returns the []byte or io.ReaderAt passed to Parse or ParseReaderAt.
> //
> // fileLength is the largest file offset referred to by f's tables. An
> // io.ReaderAt doesn't necessarily know its own 'file length'.
> func (f *Font) Source() (s Source, fileLength int64)
>
> Tangentially, using a TTF/OTF font needs random access to the underlying data, unlike e.g. decoding a JPEG using a 'one and done' sequential read. Package sfnt was designed to work with either a []byte or an io.ReaderAt, but the code paths are more complicated for io.ReaderAt. I'm curious if anyone actually uses the io.ReaderAt support or whether, in hindsight, it was unnecessary complexity. For example, on many systems it's possible to mmap a file as a []byte, instead of going through an *os.File, but I don't have a good sense if "on many systems" is "on all systems (in practice)"...

I think Brad may disagree on the availability of mmap on "all systems" :)

looking at some of my uses of sfnt.ParseXYZ, I indeed get more []byte uses than io.ReaderAt ones, but that's mainly because I always provide a way to package fonts like goregular does (ie: w/ a []byte).
with the advent of io/fs, the stat count may well reverse.

Nigel Tao

Feb 10, 2021, 7:56:26 PM
to Sebastien Binet, Robert Engels, golang-nuts
On Mon, Feb 8, 2021 at 8:40 PM Sebastien Binet <s...@sbinet.org> wrote:
> SGTM, and make Source implement io.Reader or io.WriterTo.
>
> or directly return an io.Reader (either from the underlying []byte, or as an io.SectionReader - wrapping the io.ReaderAt+size) ?

On further thought, since a Source (and its []byte or io.ReaderAt) are stateless (safe to use concurrently) but io.Reader and io.WriterTo are stateful (e.g. the file's current position), I'll go with your original suggestion for a Font method that takes an io.Writer.
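
A sketch of why a write-to-io.Writer method keeps things stateless. The source type below only loosely mirrors the idea of the package's unexported type; its fields and the writeTo name are assumptions for illustration, not the actual API.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
	"sync"
)

// source holds font data as exactly one of b or r (hypothetical layout).
type source struct {
	b    []byte
	r    io.ReaderAt
	size int64 // only used together with r
}

// writeTo copies the whole source to w. Each call builds its own reader,
// so no read position is stored in s and calls may run concurrently.
func (s *source) writeTo(w io.Writer) (int64, error) {
	if s.b != nil {
		n, err := w.Write(s.b)
		return int64(n), err
	}
	return io.Copy(w, io.NewSectionReader(s.r, 0, s.size))
}

// demo writes the same source from several goroutines and returns the copies.
func demo() []string {
	s := &source{r: strings.NewReader("font bytes"), size: 10}
	out := make([]string, 4)
	var wg sync.WaitGroup
	for i := range out {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			var buf bytes.Buffer
			if _, err := s.writeTo(&buf); err == nil {
				out[i] = buf.String()
			}
		}(i)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(demo())
}
```

Had the source handed out a single shared io.Reader instead, the goroutines would race on its current position and each would see only part of the data.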
