Hi,
> I've seen this come up a couple of times in past threads, and was
> hoping it might have been addressed in v1.3, but it appears there is
> no multi-threading done in OpenJPEG.
>
>
> I tried a simple attempt at multithreading the decoding of individual
> components within tcd.c, which at first appearance seemed to make a
> marked improvement in decode speeds (from 1 sec per 2k tile (4
> components = 0.25 sec/comp/tile) to 0.1 sec/comp/tile, about a 60%
> improvement). However, I'm guessing that the rest of the library
> isn't written as thread-safe, because it quickly segfaults after
> decoding the first tile, usually from bad pointers, etc. I'm no
> expert on pthreads, or threading C programs, and I quickly found
> myself out of my league.
In the 2.0 alpha, it should be (fairly) easy to add support for
multithreading.
First, implement an I/O (input/output) thread that reads/writes data on
the stream.
Then follow the "normal" initialization phase.
Then copy the tcd_t structure and pass it to a thread that will
perform the tcd_(en|de)code_tile.
Only parallelize the tcd_decode_tile / tcd_encode_tile part of the
(de)coding.
Warning: the coding parameters (struct cp_t) and the image
(opj_image_t) will be shared among all the threads, so concurrent
access to them may crash.
It may be safer to copy these elements as well, but I am not sure (to be
tested).
The "clean" way (but slower in terms of coding effort) would be:
1. Add the number of components of the image to the cp_t parameters
and get rid of any pointer to the image struct in decode and
encode.
2. Store the number of decoded resolutions in the tcd_t struct and not
in the image.
3. I highly doubt the system will crash on concurrent access to the cp_t
struct (to be tested).
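
To illustrate points 1 and 2, here is a minimal sketch of a per-tile context that carries its own copy of the component count and its own resolution counter, so a worker never touches the shared image. All type, field and function names here are hypothetical stand-ins, not the library's cp_t, tcd_t or opj_image_t.

```c
/* All type and field names below are hypothetical, not the library's. */
typedef struct shared_image {
    int numcomps;
    int decoded_resolutions;   /* written while decoding: a race if shared */
} shared_image_t;

typedef struct tile_ctx {
    int numcomps;              /* copied once, read-only afterwards */
    int decoded_resolutions;   /* per-tile result, owned by one thread */
} tile_ctx_t;

/* Build a private context for one tile from the shared image. */
tile_ctx_t tile_ctx_from_image(const shared_image_t *img)
{
    tile_ctx_t ctx;
    ctx.numcomps = img->numcomps;
    ctx.decoded_resolutions = 0;
    return ctx;
}

/* Stand-in for the decode step: it only writes to its own context. */
void decode_tile_stub(tile_ctx_t *ctx, int resolutions)
{
    ctx->decoded_resolutions = resolutions;
}
```

With this layout, two workers decoding different tiles write to different tile_ctx_t objects, and the shared image is only read while the contexts are being created.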
To sum up:
1. An input thread reads/writes and decodes/encodes the headers, then
adds the resulting tcd_t struct to a pool of pending tiles. If the pool
is full (maximum memory reached), hold the input stream.
2. Multiple threads (p) wait on the pool and process only
tcd_(en|de)code_tile. The resulting encoded/decoded data is indexed
by tile number and placed in a second pool. An output
stream waits on this pool and delivers the data to the client (file,
decoding application, ...) in order.
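
The pool of pending tiles described above could be sketched as a bounded queue protected by a pthread mutex and two condition variables. This is only an illustration under assumptions: tile_job_t, tile_pool_t, POOL_CAPACITY and the pool_* functions are hypothetical names, and the void pointer stands in for the copied tcd_t.

```c
#include <pthread.h>
#include <stddef.h>

#define POOL_CAPACITY 4          /* hypothetical limit on pending tiles */

/* Hypothetical job record; tcd stands in for the per-tile tcd_t copy. */
typedef struct tile_job {
    int   tile_no;               /* tile index, used to restore output order */
    void *tcd;
} tile_job_t;

typedef struct tile_pool {
    tile_job_t      jobs[POOL_CAPACITY];
    size_t          head, count;
    int             closed;      /* set once the input thread is done */
    pthread_mutex_t lock;
    pthread_cond_t  not_full, not_empty;
} tile_pool_t;

void pool_init(tile_pool_t *p)
{
    p->head = p->count = 0;
    p->closed = 0;
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->not_full, NULL);
    pthread_cond_init(&p->not_empty, NULL);
}

/* Input thread: blocks while the pool is full, which holds the input. */
void pool_push(tile_pool_t *p, tile_job_t job)
{
    pthread_mutex_lock(&p->lock);
    while (p->count == POOL_CAPACITY)
        pthread_cond_wait(&p->not_full, &p->lock);
    p->jobs[(p->head + p->count) % POOL_CAPACITY] = job;
    p->count++;
    pthread_cond_signal(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
}

/* Input thread: announce that no more tiles will arrive. */
void pool_close(tile_pool_t *p)
{
    pthread_mutex_lock(&p->lock);
    p->closed = 1;
    pthread_cond_broadcast(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
}

/* Worker: returns 1 and fills *job, or 0 once the pool is closed and empty. */
int pool_pop(tile_pool_t *p, tile_job_t *job)
{
    int ok = 0;
    pthread_mutex_lock(&p->lock);
    while (p->count == 0 && !p->closed)
        pthread_cond_wait(&p->not_empty, &p->lock);
    if (p->count > 0) {
        *job = p->jobs[p->head];
        p->head = (p->head + 1) % POOL_CAPACITY;
        p->count--;
        pthread_cond_signal(&p->not_full);
        ok = 1;
    }
    pthread_mutex_unlock(&p->lock);
    return ok;
}
```

Each of the p worker threads would loop on pool_pop, run tcd_(en|de)code_tile on the job, and hand the result to a second pool keyed by tile_no so the output side can deliver tiles in order.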
Before multithreading the encoding part of the library, I advise
working with floats and getting rid of the fixed-point operations. This
will give a large boost to the encoding part.
> Anyone know if a mem-buffer
> implementation will be done for 2.0?
For example:

#include <string.h>   /* memcpy */
#include "openjpeg.h"

typedef struct my_opj_memory
{
    OPJ_UINT32 m_total_size;      /* size of the buffer */
    OPJ_UINT32 m_current_offset;  /* position in the buffer */
    OPJ_BYTE * m_buffer;          /* buffer */
} my_opj_memory_t;

/* Copy up to p_nb_bytes from the memory buffer into p_buffer. */
OPJ_UINT32 opj_read_from_memory (void * p_buffer, OPJ_UINT32 p_nb_bytes,
                                 my_opj_memory_t * p_data)
{
    OPJ_UINT32 l_remain = p_data->m_total_size - p_data->m_current_offset;
    l_remain = uint_min(l_remain, p_nb_bytes);
    memcpy(p_buffer, p_data->m_buffer + p_data->m_current_offset, l_remain);
    p_data->m_current_offset += l_remain;
    /* (OPJ_UINT32)-1 signals that nothing could be read */
    return l_remain ? l_remain : (OPJ_UINT32)-1;
}

/* Copy up to p_nb_bytes from p_buffer into the memory buffer. */
OPJ_UINT32 opj_write_to_memory (void * p_buffer, OPJ_UINT32 p_nb_bytes,
                                my_opj_memory_t * p_data)
{
    OPJ_UINT32 l_remain = p_data->m_total_size - p_data->m_current_offset;
    l_remain = uint_min(l_remain, p_nb_bytes);
    memcpy(p_data->m_buffer + p_data->m_current_offset, p_buffer, l_remain);
    p_data->m_current_offset += l_remain;
    /* (OPJ_UINT32)-1 signals that nothing could be written */
    return l_remain ? l_remain : (OPJ_UINT32)-1;
}
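
/* Advance the current position by up to p_nb_bytes. */
OPJ_SIZE_T opj_skip_from_memory (OPJ_SIZE_T p_nb_bytes,
                                 my_opj_memory_t * p_data)
{
    OPJ_UINT32 l_remain = p_data->m_total_size - p_data->m_current_offset;
    /* compare in OPJ_SIZE_T so a large p_nb_bytes is not truncated */
    if (p_nb_bytes < l_remain)
    {
        l_remain = (OPJ_UINT32) p_nb_bytes;
    }
    p_data->m_current_offset += l_remain;
    return l_remain ? (OPJ_SIZE_T) l_remain : (OPJ_SIZE_T)-1;
}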

/* Move the current position to the absolute offset p_nb_bytes. */
OPJ_BOOL opj_seek_from_memory (OPJ_SIZE_T p_nb_bytes,
                               my_opj_memory_t * p_data)
{
    if (p_nb_bytes > p_data->m_total_size)
    {
        return 0;
    }
    p_data->m_current_offset = (OPJ_UINT32) p_nb_bytes;
    return 1;
}

/* Create a stream whose callbacks read from / write to p_data. */
opj_stream_t* OPJ_CALLCONV opj_stream_create_memory_stream
    (my_opj_memory_t * p_data, OPJ_UINT32 p_size, OPJ_BOOL p_is_read_stream)
{
    opj_stream_t* l_stream = 00;
    if (! p_data)
    {
        return 00;
    }
    l_stream = opj_stream_create(p_size, p_is_read_stream);
    if (! l_stream)
    {
        return 00;
    }
    opj_stream_set_user_data(l_stream, p_data);
    opj_stream_set_read_function(l_stream,
        (opj_stream_read_fn) opj_read_from_memory);
    opj_stream_set_write_function(l_stream,
        (opj_stream_write_fn) opj_write_to_memory);
    opj_stream_set_skip_function(l_stream,
        (opj_stream_skip_fn) opj_skip_from_memory);
    opj_stream_set_seek_function(l_stream,
        (opj_stream_seek_fn) opj_seek_from_memory);
    return l_stream;
}

That should do the trick.
All that remains is to create a my_opj_memory_t struct and allocate a
buffer.
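
As a rough sketch of that last step, the fragment below allocates the buffer and exercises a read callback with the same logic as opj_read_from_memory. The two typedefs are stand-ins for the OpenJPEG types so that it compiles without the library, and my_opj_memory_init, my_opj_memory_free and read_from_memory are hypothetical names used only for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the OpenJPEG typedefs, so this compiles on its own. */
typedef unsigned int  OPJ_UINT32;
typedef unsigned char OPJ_BYTE;

typedef struct my_opj_memory {
    OPJ_UINT32 m_total_size;      /* size of the buffer */
    OPJ_UINT32 m_current_offset;  /* position in the buffer */
    OPJ_BYTE  *m_buffer;          /* buffer */
} my_opj_memory_t;

/* Hypothetical helper: allocate a zeroed buffer; returns 1 on success. */
int my_opj_memory_init(my_opj_memory_t *p_data, OPJ_UINT32 p_size)
{
    p_data->m_buffer = (OPJ_BYTE *) calloc(p_size, 1);
    p_data->m_total_size = p_data->m_buffer ? p_size : 0;
    p_data->m_current_offset = 0;
    return p_data->m_buffer != NULL;
}

/* Hypothetical helper: release the buffer and reset the bookkeeping. */
void my_opj_memory_free(my_opj_memory_t *p_data)
{
    free(p_data->m_buffer);
    p_data->m_buffer = NULL;
    p_data->m_total_size = 0;
    p_data->m_current_offset = 0;
}

/* Same logic as the read callback: copy what is left, up to p_nb_bytes. */
OPJ_UINT32 read_from_memory(void *p_buffer, OPJ_UINT32 p_nb_bytes,
                            my_opj_memory_t *p_data)
{
    OPJ_UINT32 l_remain = p_data->m_total_size - p_data->m_current_offset;
    if (p_nb_bytes < l_remain)
        l_remain = p_nb_bytes;
    memcpy(p_buffer, p_data->m_buffer + p_data->m_current_offset, l_remain);
    p_data->m_current_offset += l_remain;
    return l_remain ? l_remain : (OPJ_UINT32)-1;
}
```

For decoding, the buffer would be filled with the codestream before the stream is created; for encoding, it has to be large enough for the whole output, since the write callback above does not grow it.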
Hope this helps,
Jérôme