Re: pixel_array data

Andrew Bird

unread,

Jan 21, 2013, 11:42:28 AM1/21/13

to pyd...@googlegroups.com

It is a while since I looked at images. Why are you reading from ds.pixel_array, and then writing to ds.PixelData? On one image I have here, the PixelData and the pixel_array look quite different, so make sure that is what you want to be doing, and that you LPV.tostring() is the same length as whatever data you are wanting to write over.

I will play more later if you are still stuck.
Andrew

On Tuesday, 20 November 2012 15:22:52 UTC, greg2935 wrote:

Hi All,

I am relatively new to python and dicom (and pydicom). I needed something to linearize data in a number of very large dicom files and used the iterative loop shown in the wiki under manipulating pixel data: the problem with this is that it takes over 20mins per file (ds.pixel_array.shape reports 3518 x 2800) and I have over 1000 files to process on a semi regular basis. I have been messing around and thought I had a better idea: code is below:

import dicom, numpy

ds=dicom.read_file(r"c:\1.dcm", force=True)                                                      # read file

LPV = numpy.array(numpy.exp(0.003443*ds.pixel_array), dtype=int)                # convert to linear PV vs Dose and do type conversion to integer

ds.PixelData=LPV.tostring()                                                                               # move data back into dicom object

ds.save_as("afile.dcm")

This saves a dicom file but the data appears completely wrong: a mirror image of the original image is overlayed on the original and the whole thing is shifted down. As I know precious little about python and dicom formats, I am obviously doing something wrong. The data is little endian, 16 bit unsigned, uncompressed. Any ideas anyone? Also, why do you have to change data to PixelData rather than just altering pixel_array with something like ds.pixel_array=ds.pixel_array*3.43 ?

Sorry if my questions are a little silly; I hope you'll have a little patience with me.

Greg

greg2935

unread,

Jan 21, 2013, 4:39:41 PM1/21/13

to pydicom

Hi Andrew,

Many thanks for replying. I am reading from pixel_array and writing
to PixelData because that's what seems to be done in the wiki under
the Working with Pixel Data heading.

I think I'm starting to understand why this is (although its only a
guess). ds.pixel_array and ds.PixelArray prints out the same integer
numpy array. ds.PixelData outputs a string. Why create these three
arrays???

Once these arrays have been created, I have found you cannot write to
them so I assume its immutable. From the numpy.ndarray.flags manual,
"(the) WRITEABLE (flag) can only be set True if the array owns its own
memory or the ultimate owner of the memory exposes a writeable buffer
interface or is a string". I am therefore assuming because numpy
arrays cannot be re-written directly, you have to convert your data to
a string and write it to the string ds.PixelData. Then when you save
the file, its the PixelData string that is used to save your file.

I have to emphasise I am only guessing at the moment and if I'm right,
it would more difficult to reduce the size of a dicom file as I assume
you would have to rewrite some of the headers too, (such as ds.Rows,
ds.Columns). I am still struggling but think I should be able to crack
this soon.

Many thanks for your help.

Greg

Jonathan Suever

unread,

Jan 21, 2013, 5:21:14 PM1/21/13

to pyd...@googlegroups.com

Greg,

First an explanation for all the methods / properties that seem to point to the same data.

PixelData - The raw byte string that is stored in the DICOM file. It is accessible here because all DICOM elements are accessible via their keyword.

PixelArray - Returns a numpy.array instance that is created from the PixelData string. The data is converted to the proper format (uint16, endianness, etc), and the shape of the resulting numpy array is determined using ds.Rows, ds.Columns, and ds.SamplesPerPixel, ds.NumberOfFrames.

pixel_array - This is the new name for PixelArray (since 0.9.3). It was changed to ensure pep8 compliance. I guess PixelArray should actually throw a deprecation warning.

The reason that you cannot write back to pixel_array is because it uses the @property decorator and is thus is readonly. To write back any data, you must do your manipulations and then write back with ds.PixelData = numpy.array.tostring() as you've shown above.

Now I'm not sure exactly why your editing isn't working because similar operations work just fine for me; however, I do see one strange thing in your code. You mention that your data is an unsigned 16 bit integer, yet you specify the datatype of your numpy array as an int but you really want to do the following:

LPV = numpy.array(numpy.exp(0.003443*ds.pixel_array), dtype=numpy.uint16)

A more robust way to do this would be to make sure that the datatype you use matches your existing DICOM's:

original_array = ds.pixel_array

original_dtype = original_array.dtype

LPV = numpy.array(numpy.exp(0.003443 * original_array), dtype=original_dtype)

Additionally, are you reading and writing to a remote device or something? There really is no reason why your processing should be taking 20 minutes per file even with image dimensions like you mention.

-Jonathan

--
You received this message because you are subscribed to the Google Groups "pydicom" group.

To post to this group, send email to pyd...@googlegroups.com.
To unsubscribe from this group, send email to pydicom+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pydicom?hl=en.

greg2935

unread,

Jan 29, 2013, 12:11:44 PM1/29/13

to pyd...@googlegroups.com

Dear All,

Sorry for the late reply, I can only work on this stuff between routine work and so have limited time to devote to it. I've traced the 20min wait per file to version mismatches. Once I updated numpy everything speeded up just fine.

Johnathan,

Sorry, was being inconsistent.

For anyone that might have a use for this stuff, I've pasted my script below: it works perfectly but still needs a little work (exception catching etc,) Still, I've verified that it does indeed linearise my data. Please excuse the copious commented scribbles; I'm still learning

Greg

"""
The purpose of this file is to linearise the data in a directory containing DICOM files.

It will parse the directory,each file in turn and at the end print a list of the avg for a ROI
200 x 200 pixels around the center of the image.

The inverse of the Kodak STP is then multiplied by each element in each DICOM file

and saved

"""

##############################################################
# Modules
##############################################################
import dicom                                            # to read dicom files, also imports numpy/scipy
import glob                                               # glob function, allows pattern matching
import csv                                                 # to write a CSV files at the end
import os
from wxPython.wx import *                    # GUI Support
import matplotlib.pyplot                         # to allow plots
from scipy.optimize import curve_fit    # curve fitting routines
import numpy
import scipy

ROIsize=200 # this is the size of the ROI we want to average
Average_ROI=[0]
# the following list is the Air KERMAs measured for each file
x=[0.29, 0.44, 0.65, 1.02, 1.6, 2.68, 4.14, 6.68, 10.45, 16.98, 27.28, 34.11, 42.45]

'''
#############################################
# Standard STP equation for Kodak CR plates #
#############################################
'''
def model_func(x, a, b):
return (a *numpy.log(x)) + b

application = wxPySimpleApp() # needed when using wx module

'''
##################################
# Dialog to open DICOM Directory #
##################################
'''
print "Opening DICOM Directory"                         # info to stdout
dialog = wxDirDialog ( None, message = 'Pick a directory containing DICOM Files.' )
if dialog.ShowModal() == wxID_OK:
   mypath = dialog.GetPath()
else:
   mypath = None
   print "No directory selected, bye!"
   exit()
dialog.Destroy()                                        # get rid of dialog

DicomFiles = glob.glob(mypath + "\*.dcm") # list of dicom files

print "Reading DICOM Files:"
'''
##########################################################
# Each file is read, and a ROI is measured using ROIsize #
# as a square, the average is then stored as an array    #
##########################################################
'''
for file in DicomFiles:
    ds = dicom.read_file(file, force=True)              # read file
    maxRows = ds.Rows                                   # max rows in image
    maxCols = ds.Columns                                # max cols in image
    startRow = (maxRows/2) - (ROIsize/2)                # start of ROI
    endRow = (maxRows/2) + (ROIsize/2)                 # end of ROI
    startCol = (maxCols/2) - (ROIsize/2)
    endCol = (maxCols/2) + (ROIsize/2)
    n=1                                                 # count for average
    sum=0                                               # zero sum
    for rows in range(startRow,endRow,1):
        for cols in range(startCol,endCol,1):
            sum = sum + ds.pixel_array[rows,cols]       # add all pixel values in ROI
            n=n+1
    Average_ROI.append(float(sum)/(n-1))                # append the average to list Average_ROI
                                                        # n-1 as we start at n=1, have to type case
                                                        # int -> float to get floating point division
    print "average: %.3f" % (float(sum)/(n-1))          # info to stdout
'''
########################################
# Remove the first element as the list #
# is created with # Average_ROI[0] = 0 #
# and sort data to correspond with y   #
########################################
'''
Average_ROI.remove(0)                                   # list created with 0 as 1st element
Average_ROI.sort()                                      # filenames are not in "dose" order-> sort

print "Saving PV against Dose data to CSV file"

'''
#################################
# save Pv vs KERMA data to file #
#################################
'''
matplotlib.pyplot.scatter(x,Average_ROI)
matplotlib.pyplot.show()
csv_out = open(os.path.join(mypath, "Dose_PV_Data.csv"), "wb")
#csv_out = open("DataFile.csv", "wb")                   # open file for writing
mywriter = csv.writer(csv_out)                                 # create csv writer object
for row in zip(x, Average_ROI):                                # write relevant data (zip converts data to
                                                                                      # be saved in rows)
        mywriter.writerow(row)
csv_out.close()                                                          # close file

print "Starting to fit data..."
y = Average_ROI

opt_parms, parm_cov = curve_fit(model_func, x, y, maxfev=1000)
a,b = opt_parms
print a,b
print x
print y
print model_func(x, a,b)
matplotlib.pyplot.scatter(x,y)
matplotlib.pyplot.plot(x,model_func(x,a,b))
matplotlib.pyplot.show()
'''
##########################################################
# We now need to linearise the data for each file in the #
# dicom directory mypath. A new directory must then be   #
# created in the mypath directory and the linear DICOM   #
# files saved here                                       #
##########################################################
'''
print "Opening DICOM Directory to save LINEAR files"                         # info to stdout
dialog = wxDirDialog ( None, message = 'Pick a directory to save linear DICOM Files.' )
if dialog.ShowModal() == wxID_OK:
   mysavepath = dialog.GetPath()
else:
   mysavepath = "c:"
   print "Data being saved in c:\tmp"
dialog.Destroy()

n=1                                                     # this number used for the saved filename
for file in DicomFiles:
    print file                                          # info to stdout
    ds = dicom.read_file(file, force=True)              # read file
    slope = a
    alpha = 1.0/slope
    fs = 4000.0/a                                       # needs to be slope wanted/actual slope
    print 'The value of a is ' + repr(a) + ', and fs is ' + repr(fs)

    LPV=fs*numpy.exp(ds.pixel_array*alpha)              # convert array to Linear PV
    LPV_int = numpy.uint16(numpy.rint(LPV))             # round numbers and convert to
                                                        # unsigned int 16 bit
                                                        # convert to raw bytes
    ds.PixelData = LPV_int.tostring()                   # write pixel data back to file object
    save_file = mysavepath + `n` + ".dcm"
    print save_file
    ds.save_as(save_file)
    n=n+1

p=raw_input("Press return to exit: ")

exit()

greg2935

unread,

Jan 29, 2013, 12:13:26 PM1/29/13

to pyd...@googlegroups.com

Also wanted to thanks everyone who helped, its much appreciated.

Greg

Bogdan Dzyubak

unread,

Jul 13, 2018, 4:26:14 PM7/13/18

to pydicom

Hi, Jonathan

Thank you for clarifying the significance of PixelData vs pixel_array. I had just understood that myself, but am having another issue. Was hoping you or someone else could help out.

I am trying to save images in dicom using a hardcoded header (to be consistent with our standard inputs which are in proper dicom). I think I am most of the way there but even despite specifying Rows, Columns, PixelRepresentation, and NumberOfFrames, the pixel_array tries to read it as the wrong size:

        ds.SamplesPerPixel = 1
        ds.PhotometricInterpretation = "MONOCHROME2"
        ds.PixelRepresentation = 0
        ds.HighBit = 15
        ds.BitsStored = 16
        ds.BitsAllocated = 16
        ds.Columns = Image.shape[0]
        ds.Rows = Image.shape[1]
        # if Image.dtype != np.uint16:

        ds.PixelRepresentation = 1 #
        ds.NumberOfFrames = 1

Image = (Image).astype(dtype="int16")

ds.PixelData = (Image).tostring()

cannot reshape array of size 16384 into shape (64,64)

Image is 64 by 64. I think there's a datatype issue, i.e. it sees 64x64x8 (bigger datatype). When I leave the image as float64 and don't convert it to int16, I get:

cannot reshape array of size 65536 into shape (64,64)

Many thanks,

Bogdan

Darcy Mason

unread,

Jul 14, 2018, 5:56:30 PM7/14/18

to pydicom

On Friday, July 13, 2018 at 4:26:14 PM UTC-4, Bogdan Dzyubak wrote:

...

I am trying to save images in dicom using a hardcoded header (to be consistent with our standard inputs which are in proper dicom). I think I am most of the way there but even despite specifying Rows, Columns, PixelRepresentation, and NumberOfFrames, the pixel_array tries to read it as the wrong size:

...

cannot reshape array of size 16384 into shape (64,64)

This looks like it is really a problem in the numpy array and conversions, not in pydicom code per se. 64*64 (*2 for int16) = 8192, half the number of bytes the array has.

Image is 64 by 64. I think there's a datatype issue, i.e. it sees 64x64x8 (bigger datatype). When I leave the image as float64 and don't convert it to int16, I get:

cannot reshape array of size 65536 into shape (64,64)

Again the math doesn't match: 65536/64/64 = 16 bytes, but you said you left it as float64 (8 bytes).

Are you sure your array is actually 64x64 (and two-dimensional)?

Perhaps if you post the full error traceback we can tell better where the error occurs. The code shown reads the shape to set the dicom elements, but I don't see where the reshape is happening.

Darcy

Reply all

Reply to author

Forward