compustat segments data


J Smith

Apr 29, 2016, 6:09:38 AM
to wrdssas
Hello all...and Joost 

First off, thanks to you Joost for maintaining this community and for the advice you provide, which has already been helpful to me as I look back over previous posts. 

While I read a bit about the compustat segments data in previous questions you have answered, I am hoping to clarify a few points. 


(1) What are the differences between operating segment, business segment, and geographic segment? It would seem that business segment and operating segments are 'type' identifiers that essentially function as the same. However, I need clarification. 


(2) On that point... when I set the source date equal to the data date (dropping all other observations), the business, operating, and geographic segments add up to TWICE the firm-level sales. And when I remove geographic segments, the business and operating segments usually add up to the firm-level sales. This leads me to believe that one or the other must be selected (either geographic, or business AND operating). Additionally, the geographic segments don't seem to have corresponding primary or secondary SIC codes...why is this? 


(3) Even further, what is the point of the geographic segments then? It does not appear to give information about WHERE the firm is operating, unless I am mistaken (please correct me if so). 


(4) How do I properly count the number of segments that a firm has? Do I set the source date equal to the data date and then also drop the observations that don't report a primary or secondary SIC code? (It would appear that this effectively eliminates the geographic segments.) OR...do I set the source date equal to the data date and also count the geographic segments, even if that would be double counting? Looking for clarification here also. 




I think that covers it. Again, thanks for your help, Joost or to anyone else who may read this and reply. 

Kind regards,
J

joost impink

Apr 29, 2016, 4:18:23 PM
to wrdssas
hi J.,

Good to hear the website/forum have been helpful!

About your questions. For (1): I'm not sure there is any difference between segment types 'business' and 'operating' (it may be a timing thing before/after SFAS 131), but I normally treat these as the same, i.e., as industrial segments, where most likely the SIC codes differ (one firm doing multiple things). Geographic segments are another dimension, and firms may provide both segment breakdowns (by activity and by location). In principle, a firm can have diversified industrial activities while being active in only one area. In that case, a geo segment breakdown would show a single segment. At the other extreme, an oil drilling company can be active in many places, with all activity being the same.

Indeed, for (2), if you break down some type of measure (sales, net income, or assets) on two dimensions (industry code and geographic location) and you add them all up, you get twice the numbers. Compustat/S&P fills the segment dataset based on the 10-K segment information footnote. Based on how firms describe their industrial segments the 'data entry' workers get some detail to decide what the best fitting SIC should be. For geo segments they have nothing to go on, so that is left blank.
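
A minimal sketch of the double counting described above, using made-up numbers for a single firm-year. The segment type codes 'BUSSEG' and 'GEOSEG' and the tuple layout are assumptions for illustration; check the values of the type variable in your own extract.

```python
from collections import defaultdict

# Hypothetical segment rows for one firm-year: (segment type, name, sales).
rows = [
    ("BUSSEG", "Drilling", 60.0),
    ("BUSSEG", "Refining", 40.0),
    ("GEOSEG", "United States", 70.0),
    ("GEOSEG", "Europe", 30.0),
]
firm_sales = 100.0  # hypothetical firm-level sales

# Sum sales separately within each breakdown (industry vs. geography).
sales_by_type = defaultdict(float)
for stype, _name, sales in rows:
    sales_by_type[stype] += sales

# Adding both breakdowns together counts every sales dollar twice.
total_all_types = sum(sales_by_type.values())

print(dict(sales_by_type))
print(total_all_types / firm_sales)
```

Each breakdown reconciles to firm sales on its own, so any sum-of-segments check should be done within one segment type at a time.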

(3) geo segments should give details of location; if you browse through wrds_segmerged, and look at variable 'snms' you'll find 'Europe', 'United States', etc for geo segments (but, it is also often blank for geo segments).

(4) I have a piece of code online that computes the number of geo segments: https://gist.github.com/joosti/213050de42d6e78f1634#file-comp_segment_industrial_segment_count-sas. It should give you a starting point for counting the number of industrial segments. There are different ways to treat industrial segments that have the same 4-digit SIC code, though (some papers combine these, some don't). By the way, there is considerable measurement error/noise in the segment files. For many firms no segment data is available. There also seem to be inconsistencies: for many 1-segment firms, the SIC code of that one segment does not match the SIC code in Compustat, sum-of-segment data does not always add up, etc.
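
This is not the code from the gist linked above, just a rough Python sketch of the two counting conventions mentioned (count every industrial segment, or combine segments that share a SIC code). The field names 'stype', 'sid', 'sics1' and the type codes 'BUSSEG'/'OPSEG' are assumptions about the extract, and the rows are made up.

```python
def count_industrial_segments(rows, combine_same_sic=False):
    """Count industrial (business/operating) segments for one firm-year.

    rows: list of dicts with keys 'stype', 'sid', 'sics1' (names assumed).
    combine_same_sic: if True, segments sharing a primary SIC count once.
    """
    industrial = [r for r in rows if r["stype"] in ("BUSSEG", "OPSEG")]
    if combine_same_sic:
        # Distinct non-missing primary SIC codes; segments without a SIC are ignored.
        return len({r["sics1"] for r in industrial if r["sics1"]})
    # Every distinct (type, segment id) pair counts as its own segment.
    return len({(r["stype"], r["sid"]) for r in industrial})


# Hypothetical rows, already filtered to one firm-year with datadate = srcdate.
rows = [
    {"stype": "BUSSEG", "sid": 1, "sics1": "1311"},
    {"stype": "BUSSEG", "sid": 2, "sics1": "1311"},
    {"stype": "BUSSEG", "sid": 3, "sics1": "2911"},
    {"stype": "GEOSEG", "sid": 4, "sics1": None},
    {"stype": "GEOSEG", "sid": 5, "sics1": None},
]

print(count_industrial_segments(rows))
print(count_industrial_segments(rows, combine_same_sic=True))
```

If a firm-year carries both business and operating rows, the uncombined count may overstate the number of segments, so it is worth inspecting a few firms by hand first.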

Hope this helps!

Best Regards,

Joost

J Smith

Apr 30, 2016, 8:38:48 PM
to wrdssas
Joost, thanks for your timely reply. 

I have a follow-up question. 

(1) When counting segments, it seems that I would have to set source date = data date, remove segments that don't list any SIC, and then count. Is that correct? If I counted operating/business AND geographic segments, it seems I would end up with a larger segment count than there really is... 

(2) Can you recommend a few papers that describe the mechanics of counting segments, either counting only segments with different SICs or simply counting every segment that has an SIC? 


Thanks, 
J

joost impink

May 1, 2016, 10:58:08 AM
to wrdssas
Hi,

Filtering on datadate being equal to the source date (srcdate) will remove doubles. The segment data for a year (say, ending Dec 31, 2014) will appear in three fiscal years: the fiscal year ending Dec 31, 2014, and also the next two years, where the 2014 data will be included as previous-year data.
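
As a small illustration of that filter (a sketch with made-up rows, not actual Compustat output; the dict keys mirror the variable names used in this thread):

```python
# Hypothetical rows: the same fiscal-2014 segment as it appears in three
# successive annual reports (srcdate 2014, 2015 and 2016).
rows = [
    {"gvkey": "001", "datadate": "2014-12-31", "srcdate": "2014-12-31", "sid": 1, "sales": 50.0},
    {"gvkey": "001", "datadate": "2014-12-31", "srcdate": "2015-12-31", "sid": 1, "sales": 50.0},
    {"gvkey": "001", "datadate": "2014-12-31", "srcdate": "2016-12-31", "sid": 1, "sales": 50.0},
]

# Keep only the rows reported in the annual report for that same fiscal year.
original = [r for r in rows if r["datadate"] == r["srcdate"]]

print(len(rows), len(original))
```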

Some papers that use segment data (didn't check which ones are most relevant):

Bens, Daniel A., Philip G. Berger, and Steven J. Monahan, 2011, Discretionary disclosure
in financial reporting: An examination comparing internal firm data to externally
reported segment data, The Accounting Review 86, 417–449.

Berger, Philip G., and Rebecca N. Hann, 2003, The impact of SFAS No. 131 on information
and monitoring, Journal of Accounting Research 41, 163–223.

Bushman, R., Raffi J. Indjejikian, and Abbie Smith, 1995, Aggregate performance measures
in business unit manager compensation: The role of intrafirm interdependencies,
Journal of Accounting Research 33, 101–128.

Ettredge, Michael, Soo Young Kwon, and David Smith, 2002, Competitive harm and
companies’ positions on SFAS No. 131, Journal of Accounting, Auditing & Finance
NS17, 93–109.

Givoly, Dan, Carla Hayn, and Julia D’Souza, 1999, Measurement errors and information
content of segment reporting, Review of Accounting Studies 4, 15–43.

Harris, Mary S., 1998, The association between competition and managers’ business
segment reporting decisions, Journal of Accounting Research 36, 111–128.

Hope this helps,

Joost

J Smith

May 1, 2016, 4:19:15 PM
to wrdssas
Thank you, Joost. Much appreciated. 

J Smith

May 3, 2016, 8:19:52 AM
to wrdssas
Joost or anyone else,

I think I have one more question. 


Instead of filtering on datadate equal to source date, would it be acceptable to filter on datadate equal to source date + 2 years? That way I would potentially get the most accurate, up-to-date information for each segment observation. 

Thanks,
Rich 


joost impink

May 3, 2016, 8:26:02 AM
to wrdssas
hi Rich,

I suppose it depends on what you want.

If over time the firm decides to change its reported segments (restructuring), then with this approach you will get segment info that may not have been the actual segments for that year. Another reason to keep datadate=srcdate is that it gives the data that was available to investors at that point in time. But this may not be relevant for your case.
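
To make the trade-off concrete, here is a hedged sketch that keeps either the original or the most recent srcdate vintage per firm-year. The helper name `pick_vintage`, the field layout, and the rows are all hypothetical; the earliest srcdate is used as a stand-in for datadate=srcdate.

```python
def pick_vintage(rows, latest=False):
    """Keep one srcdate vintage per (gvkey, datadate).

    latest=False keeps the earliest srcdate (the original report);
    latest=True keeps the most recent srcdate (possibly restated).
    Dates are ISO strings, so string comparison orders them correctly.
    """
    choose = max if latest else min
    best = {}
    for r in rows:
        key = (r["gvkey"], r["datadate"])
        best[key] = choose(best.get(key, r["srcdate"]), r["srcdate"])
    return [r for r in rows if r["srcdate"] == best[(r["gvkey"], r["datadate"])]]


# Hypothetical firm that reported two segments for fiscal 2007 and, after a
# restructuring, re-presented fiscal 2007 as three segments in its 2009 report.
rows = [
    {"gvkey": "001", "datadate": "2007-12-31", "srcdate": "2007-12-31", "sid": 1},
    {"gvkey": "001", "datadate": "2007-12-31", "srcdate": "2007-12-31", "sid": 2},
    {"gvkey": "001", "datadate": "2007-12-31", "srcdate": "2009-12-31", "sid": 1},
    {"gvkey": "001", "datadate": "2007-12-31", "srcdate": "2009-12-31", "sid": 2},
    {"gvkey": "001", "datadate": "2007-12-31", "srcdate": "2009-12-31", "sid": 3},
]

as_reported = pick_vintage(rows)               # point-in-time view
as_restated = pick_vintage(rows, latest=True)  # later, re-presented view

print(len(as_reported), len(as_restated))
```

The two views answer different questions: the first is what investors saw at the time, the second is the firm's later presentation of the same year.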

Best Regards,

Joost

J Smith

May 3, 2016, 3:45:20 PM
to wrdssas
Thanks again for your help - 

So that's interesting. I would think that grabbing the restatement (e.g., two years later) would ensure that I get what the actual segments were for that year, meaning that the firm could have 'misstated' its segment data when it originally reported it for that year. 

For example, General Electric formed an alliance in 2007, and the alliance data from SDC lists General Electric's participant SIC code as '3612'. However, for that year, neither the segments data nor any reported SIC codes for GE make reference to this 3612 SIC code. On the other hand, when I look at the restatement of the GE segments two years later, voila, the 3612 SIC code is there. 


Thanks

joost impink

May 5, 2016, 9:28:35 PM
to wrdssas
There is another explanation other than 'misstatements' for the example you give. Following the segment disclosure standard (SFAS 131, if I'm not mistaken), firms need to disclose operating segments in line with how the firm has organized its operations. If it takes some time before a new acquisition is integrated, then it is perfectly in line with the standard if it doesn't show up as a segment in the year of acquisition.

Best Regards,

Joost

Vasili

May 12, 2016, 10:54:49 AM
to wrdssas
Hi Joost,

What I am trying to do is separate the whole COMPUSTAT database, for the year 2006, into firms that have a large portion of their total sales outside the U.S. and firms that mostly make sales in the U.S. The Data Date vs. Source Date issue is causing me serious confusion, as I do not know which date is appropriate for my data filtering. 
Could you please briefly explain to me the difference between those two dates? Which one would you recommend for my purpose?
Looking forward to your reply,

Best regards,
Vasili 

joost impink

May 12, 2016, 11:07:16 AM
to wrdssas
hi Vasili,

It is probably a good idea to take a few 10-Ks and compare the segment disclosure with the data in Compustat Segments.

srcdate and datadate are the same for the annual report of that year; so if fiscal year ends at Dec 31, 2015, then srcdate and datadate will be Dec 31, 2015. The next annual report for fiscal 2016 will also include segment info for 2015 (as a benchmark for the 2016 segment data). In that case the srcdate will be Dec 31, 2016, and the datadate will still be Dec 31, 2015. Unless the firm changed their segments, the same data will be repeated.

Filtering srcdate to equal datadate would make sense to me.
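
For the U.S. versus non-U.S. split, a rough sketch along these lines may help as a starting point. It assumes the geographic rows carry a segment name in 'snms', that 'United States' is how the domestic segment is labeled, and that fiscal 2006 can be picked by the year in datadate; all of these need checking against the actual data, and the function name and rows are hypothetical.

```python
def foreign_sales_share(geo_rows, domestic_names=("United States",)):
    """Share of geographic-segment sales outside the domestic names.

    Returns None when the share cannot be computed: no rows, zero total
    sales, or any blank segment name (which cannot be classified).
    """
    if not geo_rows or any(not r["snms"] for r in geo_rows):
        return None
    total = sum(r["sales"] for r in geo_rows)
    if total == 0:
        return None
    foreign = sum(r["sales"] for r in geo_rows if r["snms"] not in domestic_names)
    return foreign / total


# Hypothetical rows for one firm; dates are ISO strings.
rows = [
    {"stype": "GEOSEG", "datadate": "2006-12-31", "srcdate": "2006-12-31", "snms": "United States", "sales": 70.0},
    {"stype": "GEOSEG", "datadate": "2006-12-31", "srcdate": "2006-12-31", "snms": "Europe", "sales": 20.0},
    {"stype": "GEOSEG", "datadate": "2006-12-31", "srcdate": "2006-12-31", "snms": "Asia", "sales": 10.0},
    {"stype": "GEOSEG", "datadate": "2006-12-31", "srcdate": "2007-12-31", "snms": "Europe", "sales": 20.0},
    {"stype": "BUSSEG", "datadate": "2006-12-31", "srcdate": "2006-12-31", "snms": "Widgets", "sales": 100.0},
]

# Geographic rows for fiscal 2006, as originally reported (srcdate = datadate).
geo_2006 = [
    r for r in rows
    if r["stype"] == "GEOSEG"
    and r["datadate"] == r["srcdate"]
    and r["datadate"].startswith("2006")
]

share = foreign_sales_share(geo_2006)
print(share)
```

Firms with blank geographic segment names come back as None here rather than being forced into either group, which seems safer given how often the names are missing.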

Best Regards,

Joost

Vasili

May 12, 2016, 8:57:02 PM
to wrdssas
Great! Thanks for the analytical and swift reply!

jeevan amin

Mar 17, 2017, 8:04:53 AM
to wrdssas
Hello, 

Can you please tell me the difference between  “datadate” and “srcdate” variables in Compustat? It will be really very helpful. Thanks. 

Bo Li

Oct 14, 2021, 9:46:19 AM
to wrdssas
Hello Joost,

Thank you very much for being so helpful with the Compustat segment data. I learned a lot from this conversation. I just have one quick question: for some firm-year observations (gvkey-datadate level), there is no matching observation in the Compustat segment data. Can I assume that these firms do not report segments, so that I can treat their business as a single segment? 

Thank you very much in advance!

Best Regards,
Bo Li,
Arizona State University. 

joost impink

Oct 14, 2021, 9:48:36 AM
to Bo Li, wrdssas
hi Bo,

The segment data is populated with the segment footnote. If there is no such footnote, then there are no entries in the segment dataset. It is a pretty safe assumption that in that case the firm is active in a single segment.
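
In code, that assumption amounts to defaulting the segment count to 1 for firm-years without a match. A minimal sketch with hypothetical keys and counts:

```python
# Hypothetical firm-years from the main Compustat file: (gvkey, datadate).
firm_years = [
    ("001", "2020-12-31"),
    ("002", "2020-12-31"),
    ("003", "2020-12-31"),
]

# Hypothetical segment counts computed from the segment file; firm 002 has
# no segment rows at all.
segment_counts = {
    ("001", "2020-12-31"): 3,
    ("003", "2020-12-31"): 2,
}

# Firm-years without any segment rows are assumed to be single-segment firms.
n_segments = {fy: segment_counts.get(fy, 1) for fy in firm_years}

print(n_segments)
```

It may be worth flagging the defaulted firm-years in a separate indicator so the assumption can be dropped in a robustness check.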

Best,

Joost


Bo Li

Oct 15, 2021, 9:44:04 AM
to wrdssas
Hi Joost,

Thank you very much! I really appreciate your help in answering this question, which is very important for my research! 

I would also like to thank the WRDSSAS group in general: reading the key explanations in this conversation taught me a lot about the segment data. 

Is this group organized by WRDS?

Thank you again and hope you enjoy a wonderful day! 

Best,
Bo 
