F ballpark issues


Jemery Day

Jul 9, 2025, 11:00:27 PM
to SS3 - Forum

Hi,

I am having some problems with a model with F_Ballpark playing a larger role than I think it should. 

In previous Synthesis models I have worked on, F_Ballpark was set (at a roughly reasonable value), presumably helped the model get to a good place in the early phases, and usually didn't have much influence. I have a likelihood profile from an assessment that I conducted in 2018 where, over a parameter range (up and down from the optimum), the total likelihood changes by about 2 likelihood units (so more than the critical 1.92). F_Ballpark changes by about 0.15 likelihood units in this range, so it has an influence, but it is really small. Other likelihood components are much more influential, with changes over that same parameter range ranked as follows: index/CPUE data (3.5 units), discard (3 units), age (1 unit), length (0.5 units), recruitment (about 0.16 units) and then F_Ballpark (0.15 units).

So here F_Ballpark had an influence – but it wasn’t driving this 2018 assessment.

Today I’m working on a swordfish assessment, and a similar likelihood profile on log(R_0) has F_Ballpark as the most influential likelihood component. That seems problematic!

When I try to phase out F_Ballpark (set the lambda to zero in the last estimation phase), some configurations of the model blow up, with log(R_0) hitting an upper bound and an unrealistically high population.

My question is: how can I tame F_Ballpark, so that it influences the model enough to prevent it estimating an almost infinite initial biomass, but without letting that setting drive my assessment results? I would hope that the data would be more informative than the value I choose for F_Ballpark (e.g. 0.2 in 2001, chosen somewhat arbitrarily).
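
For reference, the phase-out I mention above is set through the lambda section near the bottom of the control file. A sketch of the relevant lines (the phase number here is illustrative; 17 is the like_comp code for F_ballpark in SS3.30):

    # like_comp fleet phase value sizefreq_method
    17 1 3 0 1     # F_ballpark lambda set to 0 from phase 3 onward
    -9999 1 1 1 1  # end of lambda change list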


In the likelihood report section of my Report.sso file I have the following lines:
    F_Ballpark 1.78656 1
    F_Ballpark(info_only)_2001_estF_tgtF 0.0302054 0.2

So the likelihood associated with this is 1.78656 (for my optimised model), and the second line (for info only) shows that while the tgtF is 0.2 (the setting I provided), the estF is almost an order of magnitude lower at 0.0302054.
 
I’m not sure how estF is being calculated, and am wondering if it can be used to choose a less arbitrary value for F_Ballpark than 0.2. That way, if I’m forced to include a value for F_Ballpark that is not phased out in the last estimation phase, at least it is a reasonable value?
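
Incidentally, the reported numbers are consistent with a simple penalty of the form 0.5*(ln(estF) - ln(tgtF))^2, although that is just my inference from the values, not a documented formula. A quick check in Python:

    import math

    est_f = 0.0302054  # estF from the info_only line in Report.sso
    tgt_f = 0.2        # tgtF, the F_Ballpark value I supplied

    # half the squared log-ratio of estimated to target F
    penalty = 0.5 * (math.log(est_f) - math.log(tgt_f)) ** 2
    print(round(penalty, 4))  # 1.7866, very close to the reported 1.78656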

ciao

Jemery

Michael Schirripa

Jul 10, 2025, 10:11:37 AM
to Jemery Day, SS3 - Forum
Jemery;

Which F method are you using? 

4 # F_Method:  1=Pope midseason rate; 2=F as parameter; 3=F as hybrid; 4=fleet-specific parm/hybrid (#4 is superset of #2 and #3 and is recommended)
4 # max F (methods 2-4) or harvest fraction (method 1)

Michael


Richard Methot - NOAA Federal

Jul 10, 2025, 11:08:40 AM
to Michael Schirripa, Jemery Day, SS3 - Forum
The approach is designed to work with all F methods. I suggest doing an R0 profile to find out which likelihood component is pushing the result to high biomass.
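If it helps, a rough sketch of how such a profile can be scripted (Python; the ss3 executable name, the control.ss file name, and the crude line edit below are assumptions to adapt to your own setup):

    import os
    import shutil
    import subprocess

    def set_fixed_lnR0(ctl_path, lnR0):
        # Crude edit: find the parameter line carrying the usual
        # '# SR_LN(R0)' label, set its INIT (3rd column) and make its
        # PHASE (7th column) negative so the parameter is held fixed.
        # Assumes the standard long parameter-line layout:
        # LO HI INIT PRIOR PR_SD PR_type PHASE ...
        with open(ctl_path) as f:
            lines = f.readlines()
        for j, line in enumerate(lines):
            if "SR_LN(R0)" in line and not line.lstrip().startswith("#"):
                cols = line.split()
                cols[2] = str(lnR0)  # INIT
                cols[6] = "-1"       # negative phase => not estimated
                lines[j] = " ".join(cols) + "\n"
        with open(ctl_path, "w") as f:
            f.writelines(lines)

    # One run per fixed ln(R0) value over an illustrative grid.
    for i, lnR0 in enumerate([9.0, 9.5, 10.0, 10.5, 11.0]):
        run_dir = f"profile_{i:02d}"
        shutil.copytree("base_model", run_dir)
        set_fixed_lnR0(os.path.join(run_dir, "control.ss"), lnR0)
        subprocess.run(["ss3", "-nohess"], cwd=run_dir, check=True)
        # per-component likelihoods are then read from each run's Report.sso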
Rick

Richard  D. Methot Jr.

Stock Assessment Research Scientist (ST)

Northwest Fisheries Science Center

NOAA Fisheries | U.S. Department of Commerce

Office: (425) 666-9893

Mobile: (301) 830-2454

www.fisheries.noaa.gov 





jemer...@gmail.com

Jul 11, 2025, 3:54:31 AM
to Richard Methot - NOAA Federal, Michael Schirripa, SS3 - Forum

Hi Rick,

Thanks for the answers and comments. I tried to reply earlier, through the Google Groups website – but that reply seems to be lost in the cloud – so sorry if I’m repeating myself here…

First some settings from my control file to answer Michael’s question:

4 # F_Method:  1=Pope midseason rate; 2=F as parameter; 3=F as hybrid; 4=fleet-specific parm/hybrid (#4 is superset of #2 and #3 and is recommended)

2.9 # max F (methods 2-4) or harvest fraction (method 1)

I’m not sure if this plot will get through Google groups - but I am pasting my profile on R_0 below (and I emailed it through to you separately Rick, just in case):

[Image: likelihood profile on log(R_0)]

There are clearly some issues at a couple of points on this profile (potentially convergence), but the general influences can be seen in the smooth parts of the curve. My interpretation is that the components pushing the result to lower biomass are, in order, F_Ballpark (the largest influence on this profile), weight-composition data (generalised size data) and, with minimal influence, recruitment. The influences pushing the result to higher biomass are, again in order, priors (I have a prior on M), index (CPUE in my case), length data and age data (conditional age-at-length data). Having F_Ballpark and the priors as the two most influential components of the likelihood profile seems concerning to me.

I have some specific questions on F_Ballpark.

  1. In the section of the report file labelled “LIKELIHOOD report:2” there is a section on F_Ballpark (info only). Maybe I am overinterpreting (or misinterpreting) this, but the line labelled estF_tgtF caught my attention: the estF reported there is very small (around 0.03), perhaps indicating a much smaller F than the value I was supplying for F_Ballpark (0.2, which I thought was reasonable, i.e. in the ballpark). Why is estF so low? What does it represent? Should I reduce the value of F_Ballpark in response, especially given that the value I am using for F_Ballpark seems to be determining my estimate of population scale (not a very comfortable place to be)?

  2. Can you offer an opinion on the validity of leaving F_Ballpark active? As a reminder, when I phase F_Ballpark out (my preferred option), using a lambda of zero in the last phase, the estimate of R_0 hits the upper bound (the population blows up!). If I leave F_Ballpark in, that seems to set the scale for my assessment in a completely subjective manner, and, if I interpret estF correctly, perhaps suggests I need to choose a lower value of F_Ballpark, a value so low that it seems implausible to me. I’m struggling to interpret the meaning and scale of F_Ballpark given the very low reported estF.

  3. In earlier versions of SS, I understand that F_Ballpark was automatically phased out in the latter stages of estimation. That appears to have been relaxed to give users control, but the new default appears to be to leave F_Ballpark on. Why was that default switched, given it seems to encourage users not to phase out the influence of F_Ballpark? I understand this is the opposite of what may be considered best practice (noting that in most cases F_Ballpark isn’t influential).

An alternative explanation is that my data contain no useful information for estimating the scale of the population, and if scale is needed for management advice, then perhaps this assessment is not fit for purpose and other approaches (such as data-poor methods?) may be more appropriate.

ciao

Jemery

Michael Schirripa

Jul 11, 2025, 10:22:34 AM
to jemer...@gmail.com, Richard Methot - NOAA Federal, SS3 - Forum
Jemery;

Another question: are you starting the model in a year of no fishing/no catch (i.e. pre-fishing), or after the fishery has started? If the first year of your model is after the fishery has started, it can sometimes be challenging to estimate that initial F along with R0. I have found that starting the model with no fishery makes for a more stable model; for my ICCAT swordfish assessment this is 1950. Even if you have to linearly ramp the fishery catch up to the first year with observed catch, this can often be the better choice. Just something to consider.
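
A trivial sketch of building such a ramp (Python; the years and catch values are purely illustrative):

    # Linear ramp from zero catch in the unfished start year up to the
    # first year with observed catch (all values illustrative).
    start_year = 1950
    first_obs_year, first_obs_catch = 1971, 12000.0
    span = first_obs_year - start_year
    ramp = {yr: first_obs_catch * (yr - start_year) / span
            for yr in range(start_year, first_obs_year)}
    # ramp[1950] == 0.0, rising linearly toward the first observed catch;
    # these rows would be added to the catch section of the data file.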

Richard Methot - NOAA Federal

Jul 11, 2025, 12:04:27 PM
to SS3 - Forum
Hi Jemery,
One of your earlier messages did hang up for a while in the Forum's spam filter. When that happens, it still comes to me to be approved before displaying to everyone. So, sorry about the delay.

Thanks for including the R0 profile plot. One issue it highlights is that the generalized size comp data have an opposite signal compared to the age, length and index data. The prior logL profile also runs counter to the F ballpark profile. My recommendation is to examine the basis for the priors to ensure that they are needed, and to look into the possible reasons for the generalized size comp running contrary to the other data. I can't speculate on what is going on there; you'll need to dig in.

Regarding the F ballpark value: it is compared to the quantity termed annual_F, which I think is pretty well described in our webinar on F, found here: https://nmfs-ost.github.io/ss3-website/qmds/webinars.html.
Rick

Jemery Day

Jul 14, 2025, 11:18:55 AM
to SS3 - Forum
Thanks Rick,

Just to clarify: I am using F_Method 4, with max F set to 2.9.

I have run a likelihood profile on R_0 for this model (or a very similar precursor), which shows that CPUE, length and age data are all pushing the result towards a higher biomass, as is the prior on M. The profile did have some convergence issues at some R_0 values (which I have not addressed yet), but even so, I think the smooth parts of the curve carry some useful information. I'm happy to share the image (perhaps privately by email?), but I can't seem to paste an image into this forum. The components with the largest contribution to the likelihood (in the profile) were, in order: F_Ballpark, priors (only on M), generalised size data (weight data in my case), index (CPUE in my case), length, age (conditional age-at-length data) and then recruitment (which is almost flat).

What confuses me is the estF value reported alongside the F_Ballpark value (info only) in the LIKELIHOOD REPORT:2 section of the Report.sso file, which seems to suggest a very low value: as I indicated earlier, around 0.0302 when F_Ballpark was set at 0.2. I'm trying to make sense of this estF value, and wondering if I need to set the F_Ballpark value much lower, e.g. to values that don't seem biologically realistic, like 0.01, 0.02 or 0.03. Perhaps I am misinterpreting estF_tgtF here?

I'm also interested in your thoughts on phasing out F_Ballpark. I understand that the default in Synthesis used to be for F_Ballpark to be phased out. That setting has recently been updated to give the user control, but with the default option (if you do nothing) being for F_Ballpark not to be phased out. I'm wondering why this change was made? Is it not preferable to have it phased out by default?

Can you also offer an opinion on the validity of leaving F_Ballpark turned on, in general and in my particular case, especially when it seems to be influential and the model doesn't converge when I phase it out?

ciao

Jemery

Richard Methot - NOAA Federal

Jul 14, 2025, 11:39:35 AM
to SS3 - Forum
Hi Jemery,
Your profile image did display to the group, although perhaps I saw it because your message copied me directly. It would be good if someone else on the forum could verify seeing Jemery's R0 profile in an earlier post.

Regarding phasing out of F_ballpark: the only reason it no longer automatically phases out was to adopt the same logL phasing approach as for all other logL components. I decided that giving users control of the rate of phase-out was preferable to forcing the original phase-out rate.
I do think that phasing it out at the end is the better approach. F_ballpark is intended to be a way to push the model towards a reasonable F level early in the model run, then to allow the signal in the data to inform the final answer. If your final answer still has a positive lambda for F_ballpark, then it is basically a prior on that derived quantity. That might be OK if you had a multi-species fishery and several other species in that fishery provided information on the F exerted by the fishery, but otherwise you are simply assuming you know the answer before you start. The same holds true for the use of a "depletion fleet" to provide the model with information on the degree of stock depletion (Bratio).

The data you have certainly seem to be providing contrary signals. I am particularly interested in the observation that the length comps push the model in one direction and the generalized size comps in the other. The only logical way for that to be true would be if those data covered different ranges of years or different fisheries with different selectivity.

I'd be glad to take a look if you want to send your input files to me at nmfs.stock...@noaa.gov

Rick

jemer...@gmail.com

Jul 15, 2025, 12:53:32 AM
to Richard Methot - NOAA Federal, SS3 - Forum

Hi Rick,

Thanks very much for that response. I think I reached a similar conclusion on the use of F_ballpark.

The conflict between the length and weight data is confounded by multiple factors in this assessment: a large spatial range (ignoring any possibility of change in growth between regions), multiple countries operating in the fishery, very different sampling methodologies and targeting practices in different regions, a mixture of bycatch and targeted fisheries, the use of different length measurement metrics by different data collection agencies, and the use of processed weights, again with different processing standards in different places and potentially incorrect conversion factors. I have done my best to deal with many of these issues, but it has been challenging. Maybe these are just the usual fisheries data issues that we all face in some shape or form?

I have approached this conflict by estimating separate selectivities for length fleets and weight fleets, especially those operating in the same region. That said, conflict between weight and length data seems to be quite common in the pelagic fisheries models we deal with at SPC, which operate over a very large spatial scale.

Mark Maunder responded to my message (privately) with a number of really helpful suggestions: reweighting my data, filtering some of it even more stringently, fixing some selectivities and then heavily downweighting (essentially to zero) the associated length frequencies for some of the more problematic fisheries (those with inconsistent, episodic, irregular and possibly unrepresentative sampling), and also encouraging me to turn off F_Ballpark.

Needless to say, I now have a much more secure base from which to refine the model further; my likelihood profiles look much more reasonable, and F_Ballpark is turned off.

Thanks very much, Mark, for your help on this problem. I hope you don't mind me sharing your summary publicly; I think it is particularly apt:

“You basically have a length based data poor method that assumes asymptotic selectivity. With some tweaks to make it non equilibrium and take account for catch. But not unlike a lot of tuna assessments.”

ciao

Jemery

