Buffered reading of BLOB from postgres database


Sverre Aleksandersen

Nov 29, 2023, 12:30:18 PM
to jOOQ User Group
Hi,

I have a use case where I store and retrieve large blobs (1 MB - 200 MB) from a postgres database.

On the input side I've managed to stream Spring Boot MultipartFiles from a REST interface all the way into the DB using the following code:
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.sql.Types;

import org.apache.commons.io.IOUtils;
import org.jooq.Binding;
import org.jooq.BindingGetResultSetContext;
import org.jooq.BindingGetSQLInputContext;
import org.jooq.BindingGetStatementContext;
import org.jooq.BindingRegisterContext;
import org.jooq.BindingSQLContext;
import org.jooq.BindingSetSQLOutputContext;
import org.jooq.BindingSetStatementContext;
import org.jooq.Converter;
import org.jooq.conf.ParamType;
import org.jooq.impl.DSL;

public class StreamingBlobBinding implements Binding<byte[], InputStream> {

    @Override
    public Converter<byte[], InputStream> converter() {
        return new Converter<>() {

            @Override
            public InputStream from(byte[] bytes) {
                return new ByteArrayInputStream(bytes);
            }

            @Override
            public byte[] to(InputStream inputStream) {
                try {
                    return IOUtils.toByteArray(inputStream);
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }

            @Override
            public Class<byte[]> fromType() {
                return byte[].class;
            }

            @Override
            public Class<InputStream> toType() {
                return InputStream.class;
            }
        };
    }

    @Override
    public void sql(BindingSQLContext<InputStream> ctx) {
        if (ctx.render().paramType() == ParamType.INLINED) {
            ctx.render().visit(DSL.inline(ctx.convert(converter()).value()));
        } else {
            ctx.render().sql(ctx.variable());
        }
    }

    @Override
    public void register(BindingRegisterContext<InputStream> ctx) throws SQLException {
        ctx.statement().registerOutParameter(ctx.index(), Types.BLOB);
    }

    @Override
    public void set(BindingSetStatementContext<InputStream> ctx) throws SQLException {
        // This line is responsible for the streaming
        ctx.statement().setBinaryStream(ctx.index(), ctx.value());
    }

    @Override
    public void set(BindingSetSQLOutputContext<InputStream> ctx) throws SQLException {
        throw new SQLFeatureNotSupportedException();
    }

    @Override
    public void get(BindingGetResultSetContext<InputStream> ctx) throws SQLException {
        throw new SQLFeatureNotSupportedException();
    }

    @Override
    public void get(BindingGetStatementContext<InputStream> ctx) throws SQLException {
        throw new SQLFeatureNotSupportedException();
    }

    @Override
    public void get(BindingGetSQLInputContext<InputStream> ctx) throws SQLException {
        throw new SQLFeatureNotSupportedException();
    }
}

and

public static long saveFileData(MultipartFile file) throws IOException {
    final var id = uow.ctx.nextval(FILE_DATA_SEQ);
    var inputStreamType = FILE_DATA.DATA.getDataType().asConvertedDataType(new StreamingBlobBinding());
    var dataColumn = DSL.field(FILE_DATA.DATA.getName(), inputStreamType);
    var result = uow.ctx.insertInto(FILE_DATA)
            .columns(FILE_DATA.ID, dataColumn)
            .values(DSL.val(id, FILE_DATA.ID.getDataType()),
                    DSL.val(file.getInputStream(), inputStreamType))
            .execute();

    if (result != 1) {
        throw new DataAccessException("Unable to store filedata");
    }

    return id;
}


I'm having more difficulty getting it to work the other way around, i.e. having jOOQ fetch a BLOB from Postgres and writing the result to an OutputStream. Does anyone have any tips on how I can achieve this?

Best regards,
Sverre Aleksandersen

tod...@googlemail.com

Nov 29, 2023, 3:48:43 PM
to jOOQ User Group
Hi Sverre,

In my apps (Spring Boot + Vaadin + jOOQ) I also have to store files uploaded by the user.
When I do this in PostgreSQL, which is BTW not always a good solution, I usually split document metadata (filename, MIME type, size, etc.) from document content (the byte array only).
Often the UI shows some document metadata in the first place, so storing and retrieving the metadata has to be pretty easy.
My method for storing the possibly large content looks like this, and surprisingly the comment contains a link to this group explaining WHY it looks the way it does :-)

/**
 * @see https://groups.google.com/g/jooq-user/c/Gwn0rce_J-Q That's why it's done this way here
 */
private void storeContent(Integer id, byte[] content) {

    if (this.jooq.configuration().connectionProvider() instanceof DataSourceConnectionProvider dscp) {
        var dataSource = dscp.dataSource();
        try (Connection conn = dataSource.getConnection();
             ByteArrayInputStream is = new ByteArrayInputStream(content);
             PreparedStatement ps = conn.prepareStatement("insert into document_storage (id, content) values (?, ?)")) {
            ps.setInt(1, id);
            ps.setBinaryStream(2, is, content.length);
            ps.executeUpdate();
        } catch (Exception ex) {
            log.error(ex.getMessage());
        }
    }
}
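For the reading direction asked about above, a plain-JDBC counterpart to the method might look like the following sketch. The table and column names mirror the insert statement; whether the driver actually streams a bytea column, rather than buffering it, is a separate question discussed further down the thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class BlobReadSketch {

    /** Copies the stream in 8 KiB chunks so the whole value never has to sit in one array. */
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    /** Reads document_storage.content for the given id and writes it to the supplied stream. */
    static void loadContent(DataSource dataSource, int id, OutputStream target)
            throws SQLException, IOException {
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(
                     "select content from document_storage where id = ?")) {
            ps.setInt(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    try (InputStream in = rs.getBinaryStream(1)) {
                        copy(in, target);
                    }
                }
            }
        }
    }
}
```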

Hope that helps

Kind regards
Dominik

Lukas Eder

Nov 30, 2023, 2:30:29 AM
to jooq...@googlegroups.com
Folks, have you had a look at what pgjdbc does with your InputStream or OutputStream? Hint: spare yourself the premature optimisation. Just use byte[] :)

Bind values:

Results:


Sverre Aleksandersen

Nov 30, 2023, 2:41:55 AM
to jOOQ User Group
I'm running the app in Kubernetes and get OOMKilled because memory consumption goes through the roof under high throughput, so it's not premature optimisation.
I want to be able to read the blobs through a buffer so that my container does not get killed. I see two solutions: (a) read the blob through a buffer and write it to disk, or (b) read the blob through a buffer and stream it out of the app over HTTP. I'm currently working on solution (a), since it is simplest and will probably be sufficient.
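Solution (a) can be sketched with plain java.io: copy the blob's InputStream to a temp file through a fixed-size buffer, so heap usage stays bounded by the buffer size. The names are illustrative; in practice the InputStream would come from the result set.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class BlobToDisk {

    /** Copies the blob stream to a temp file in 8 KiB chunks and returns the file's path. */
    static Path spoolToTempFile(InputStream blobStream) throws IOException {
        Path tempFile = Files.createTempFile("blob-", ".bin");
        try (OutputStream out = Files.newOutputStream(tempFile)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = blobStream.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return tempFile;
    }
}
```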

If I interpret the code in PgResultSet correctly, it seems like it reads the entire BLOB into memory regardless?

Regards,
Sverre

Lukas Eder

Nov 30, 2023, 2:56:30 AM
to jooq...@googlegroups.com
On Thu, Nov 30, 2023 at 8:41 AM Sverre Aleksandersen <sverre.ale...@gmail.com> wrote:
> I'm running the app in Kubernetes and get OOMKilled because memory consumption goes through the roof under high throughput, so it's not premature optimisation.
> I want to be able to read the blobs through a buffer so that my container does not get killed. I see two solutions: (a) read the blob through a buffer and write it to disk, or (b) read the blob through a buffer and stream it out of the app over HTTP. I'm currently working on solution (a), since it is simplest and will probably be sufficient.
>
> If I interpret the code in PgResultSet correctly, it seems like it reads the entire BLOB into memory regardless?

Sorry for the confusion, that's what I meant by "premature optimisation." It doesn't work this way with pgjdbc. They didn't implement the standard JDBC APIs as one would expect. 

Here's a resource on an alternative, pgjdbc-specific API that allows for streaming access to BLOBs:

I guess you could just pass around pgjdbc's LargeObject (which works like java.sql.Blob without implementing it...) and use that to stream your data both ways.
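A read-side sketch using that API might look like the following. This assumes the data is stored as a PostgreSQL large object referenced by an oid column (here hypothetically named content_oid, on a file_data table), not as bytea, and note that the Large Object API only works with autocommit off.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import org.postgresql.PGConnection;
import org.postgresql.largeobject.LargeObject;
import org.postgresql.largeobject.LargeObjectManager;

public class LargeObjectRead {

    /** Streams the large object referenced by file_data.content_oid to the target stream. */
    static void streamLargeObject(Connection conn, long id, OutputStream target) throws Exception {
        conn.setAutoCommit(false); // the Large Object API only works inside a transaction

        LargeObjectManager lom = conn.unwrap(PGConnection.class).getLargeObjectAPI();

        try (PreparedStatement ps = conn.prepareStatement(
                "select content_oid from file_data where id = ?")) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    long oid = rs.getLong(1);
                    LargeObject lo = lom.open(oid, LargeObjectManager.READ);
                    try (InputStream in = lo.getInputStream()) {
                        in.transferTo(target); // buffered copy; never the whole blob in memory
                    } finally {
                        lo.close();
                    }
                }
            }
        }
        conn.commit();
    }
}
```

The write direction is symmetric: lom.createLO() returns a new oid, and lo.getOutputStream() gives you the sink to copy the upload into.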

