Random number generation in ggplot and Shiny

43 views
Skip to first unread message

ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ

unread,
Apr 24, 2017, 1:27:56 PM4/24/17
to Shiny - Web Framework for R

I am building a plot of Net Present Value (NPV), using FinCal package, and its odds. For the NPV, the cash-flows are simulated using a triangular distribution for sales, normal distribution for costs and so on. So, here is a snippet of what I am doing:


npvCdf
<- function(n) {
  N
<- sort(n)
  P
<- ecdf(N)
 
return(P)
}

makePlot
<- function(C, m) {
  N
<- m$NPV / C$MILLION
  P
<- npvCdf(N)
 
#
 
# NPV distribution curve
  n
<- sort(N)
  p
<- P(n) * 100
  df
<- data.frame(npv = n, odds = p)
 
#
 
# Points of interest
  o
<- C$NPV_BREAK_EVEN_WORST_ODDS
  q
<- round((quantile(n, o)), C$DIGITS)
  e
<- C$NPV_BREAK_EVEN_VALUE
  b
<- P(e) * 100   # THIS IS THE ERROR I CANT FIGURE OUT
  w
<- o * 100
  s
<- getBreakEven(C, m)
 
#
 
# Labels
  npvOdds
<- paste("Odds of break-even : ", b, "%")
  salesThresh
<- paste("Sales threshold : ", s)
  worstCase
<-
    paste
("Worst case (@ 5% odds) : ", q, "million")
 
#
 
# Make plot
 
#
  g
<- ggplot(df, aes(x = npv, y = odds)) +
    geom_line
(colour = "blue") +
    labs
(title = "NPV and Odds") +
    labs
(x = "NPV (million)") +
    labs
(y = "Percent (%)") +
    geom_vline
(xintercept = e,
               colour
= "red",
               linetype
= "longdash") +
    geom_hline
(yintercept = b,
               colour
= "green",
               linetype
= "longdash") +
    geom_vline
(xintercept = q,
               colour
= "green",
               linetype
= "dotdash") +
    geom_hline
(yintercept = w,
               colour
= "red",
               linetype
= "dotdash")

The C is a data frame of all the constants that are used for calculations of cash-flows, NPV calculations, etc. For example, C$MILLION=1000000 used to divide NPV for simpler representation. The m is a data-frame of sales, cash-flows and NPV per simulation. The simulations are used for cash-flows (triangular distribution), variable cost (normal distribution) and so on.

And, here is the Shiny code that uses the above snippet.


library(shiny)
source
("../npd-c.R")
# Define server logic
shinyServer
(function(input, output) {
  output$npdPlot
<- renderPlot({

    C
<- data.frame(2017,5000,1000000,3,100,500000,0.0,0.05,0.1,
                    input$salesRange
[1],
                    input$salesRange
[2],
                    input$salesMode
,
                    input$demDeclMean
,
                    input$demDeclSd
,
                    input$varCostMean
,
                    input$varCostSd
,
                    input$fixedCostRange
[1],
                    input$fixedCostRange
[2]
                   
)
    names
(C) <-
      c
(
       
"SEED",
       
"ITERATIONS",
       
"MILLION",
       
"DIGITS",
       
"PRICE",
       
"OUTLAY",
       
"NPV_BREAK_EVEN_VALUE",
       
"NPV_BREAK_EVEN_WORST_ODDS",
       
"HURDLE_RATE",
       
"SALES_TRIANG_MIN",
       
"SALES_TRIANG_MAX",
       
"SALES_TRIANG_MODE",
       
"DEM_DECL_FACTOR_MEAN",
       
"DEM_DECL_FACTOR_SD",
       
"VAR_COST_RATE_MEAN",
       
"VAR_COST_RATE_SD",
       
"FIX_COST_RATE_MIN",
       
"FIX_COST_RATE_MAX"
     
)

    n
<- npd(C,-1)
    g
<- makePlot(C,n)
    g
 
})
})

The problem is as follows.


The same code when run in R, I get the plot right in terms of the NPV curve, horizontal and vertical lines. Whereas, when run as a Shiny application, the horizontal and vertical lines are hugely displaced. This is despite, hiving of the NPV and cash-flows code into a separate .R file and setting the same seed value for both the Shiny and non-shiny versions. For example, P(0)=40.07 without Shiny and P(0)=4.7 with Shiny application.


Here are the plots - with and without Shiny. Notice how the curve looks same but, the intersection of horizontal and vertical lines are off. imagebin.ca/v/3K47HvFC6Gt8 (with Shiny) and imagebin.ca/v/3K46LXO7ZvKs (without Shiny)

Bárbara Borges

unread,
Apr 24, 2017, 2:55:17 PM4/24/17
to Shiny - Web Framework for R
Can you share a minimal reproducible example? It's pretty hard to try to make sense of the problem without it...

ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ

unread,
Apr 25, 2017, 11:59:42 PM4/25/17
to Shiny - Web Framework for R
Since this is only for my learning purposes, I am sharing with you my Github repository. https://github.com/ProgramErgoSum/npv

Basically, I am doing a Monte-Carlo simulation using certain distributions for sales, costs and demand. The idea is to arrive at NPV values for various iterations of these distributions where every iteration results into a cash-flow for 4 years. Then, find out the odds of NPV=0 and the corresponding sales figure. Finally, also find out the NPV at worst case scenario i.e. odds of 0.05 (5%).

The results of the simulation with and without Shiny, despite using same seed value, do not match. For example, with seed value of 2017, P(NPV=0) with Shiny is 4.7% and without Shiny is 40.06%. This is the problem I want to solve.

Hope it helps.

Joe Cheng

unread,
Apr 26, 2017, 5:01:59 PM4/26/17
to ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ, Shiny - Web Framework for R
Are you saying that literally these lines of code:

  set.seed(C$SEED)
  n <- npvCalc(C, threshold)
  d <- npdDistribution(C, n)
  p <- pointsOnPlot(C, n)
  t <- buildOutputTable(C, p)

executed inside vs. outside of Shiny, give you different results?


--
You received this message because you are subscribed to the Google Groups "Shiny - Web Framework for R" group.
To unsubscribe from this group and stop receiving emails from it, send an email to shiny-discus...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/shiny-discuss/f1c5533f-43f0-4e55-9ce4-5548094e76f3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ

unread,
Apr 26, 2017, 10:38:43 PM4/26/17
to Shiny - Web Framework for R, nages...@gmail.com
Yes; that's the observation.

Maybe it is a result of an error elsewhere; but, I can't see where.

Regards,
Nagesh

Bárbara Borges

unread,
May 1, 2017, 8:52:18 PM5/1/17
to Shiny - Web Framework for R, nages...@gmail.com
It does have to be an error elsewhere though. If you try this without Shiny, you still get different answer every time. For example, here's a few runs with the same input values (I added a label to the plot produced by `npvOddsPlot()`, so that the difference was obvious):



# source(npd-c.R)

input <- list(
    varCostSd = 3,
    salesMode = 6000,
    demDeclMean = 0.9,
    fixedCostRange = c(36000, 44000),
    salesRange = c(500, 24000),
    hurdleRate = 0.1,
    varCostMean = 20,
    demDeclSd = 0.1,
    outlay = 0.5
)

npvOddsPlot(input)


npvOddsPlot(input)



npvOddsPlot(input)

ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ (Nagesh S)

unread,
May 1, 2017, 9:09:37 PM5/1/17
to Bárbara Borges, Shiny - Web Framework for R
Did you set a seed value for each of the runs?

Bárbara Borges

unread,
May 1, 2017, 9:19:06 PM5/1/17
to Shiny - Web Framework for R, bar...@rstudio.com
I had not (I though the code was supposed to take care of that - I just skimmed through it).

Here's the code with some changes (I changed C to consts because that's an R function, and it's best to avoid conflicts).

Here's the full code:

library(shiny)
library(FinCal)
library(dplyr)
library(triangle)
library(ggplot2)

SEED <- 2017
ITERATIONS <- 5000
MILLION <- 1000000

salesIter <- function(consts, threshold = -1) {
  y1 <-
    rtriangle(ITERATIONS,
      consts$SALES_TRIANG_MIN,
      consts$SALES_TRIANG_MAX,
      consts$SALES_TRIANG_MODE)
  demDecl <-
    rnorm(
      ITERATIONS,
      mean = consts$DEM_DECL_FACTOR_MEAN,
      sd = consts$DEM_DECL_FACTOR_SD
    )
  
  yr1Demand <- round(y1, consts$DIGITS)
  y2 <- round(yr1Demand * demDecl, consts$DIGITS)
  yr2Demand <- ifelse(y2 <= threshold, 0.0, y2)
  yr3Demand <- round(yr2Demand * demDecl, consts$DIGITS)
  yr4Demand <- round(yr3Demand * demDecl, consts$DIGITS)
  
  sales <-
    data.frame(yr1Demand,
      yr2Demand,
      yr3Demand,
      yr4Demand)
  names(sales) <- c("Y1Sales", "Y2Sales", "Y3Sales", "Y4Sales")
  
  return (sales)
}

revenueIter <- function(consts, sales) {
  revenue <-
    data.frame(
      round(sales$Y1Sales * consts$PRICE, consts$DIGITS),
      round(sales$Y2Sales * consts$PRICE, consts$DIGITS),
      round(sales$Y3Sales * consts$PRICE, consts$DIGITS),
      round(sales$Y4Sales * consts$PRICE, consts$DIGITS)
    )
  names(revenue) <-
    c("Y1Revenue", "Y2Revenue", "Y3Revenue", "Y4Revenue")
  
  return (revenue)
}

varCostIter <- function(consts, sales) {
  varCostRate <-
    rnorm(ITERATIONS,
      mean = consts$VAR_COST_RATE_MEAN,
      sd = consts$VAR_COST_RATE_SD)
  varCost <-
    data.frame(
      round(sales$Y1Sales * varCostRate, consts$DIGITS),
      round(sales$Y2Sales * varCostRate, consts$DIGITS),
      round(sales$Y3Sales * varCostRate, consts$DIGITS),
      round(sales$Y4Sales * varCostRate, consts$DIGITS)
    )
  names(varCost) <-
    c("Y1VarCost", "Y2VarCost", "Y3VarCost", "Y4VarCost")
  
  return (varCost)
}

fixedCostIter <- function(consts) {
  fixedCost <-
    runif(ITERATIONS,
      min = consts$FIX_COST_RATE_MIN,
      max = consts$FIX_COST_RATE_MAX)
  
  return (fixedCost)
}

cashFlow <- function(consts, revenue, varCost, fixedCost) {
  cashFlow <-
    data.frame(
      revenue$Y1Revenue - varCost$Y1VarCost - fixedCost,
      revenue$Y2Revenue - varCost$Y2VarCost - fixedCost,
      revenue$Y3Revenue - varCost$Y3VarCost - fixedCost,
      revenue$Y4Revenue - varCost$Y4VarCost - fixedCost
    )
  names(cashFlow) <- c("Y1", "Y2", "Y3", "Y4")
  
  return (cashFlow)
}

cashFlowNpv <- function(consts, cashFlow) {
  n <- NULL
  outlay <- -1 * consts$OUTLAY * MILLION
  j <- ITERATIONS
  for (i in c(1:j)) {
    j <- cashFlow[i:i, ]
    yearlyCashFlow <- c(outlay, j$Y1, j$Y2, j$Y3, j$Y4)
    n <-
      append(n, round(npv(
        r = consts$HURDLE_RATE, cf = yearlyCashFlow
      ), consts$DIGITS))
  }
  cashFlowNpv <- mutate(cashFlow, npvY0 = n)
  
  return (cashFlowNpv)
}

salesCashFlowNpv <- function(consts, sales, cfNpv) {
  salesCashFlowNpv <- data.frame(
    Y1Sales = sales$Y1Sales,
    Y1CashFlow = cfNpv$Y1,
    Y2Sales = sales$Y2Sales,
    Y2CashFlow = cfNpv$Y2,
    Y3Sales = sales$Y3Sales,
    Y3CashFlow = cfNpv$Y3,
    Y4Sales = sales$Y4Sales,
    Y4CashFlow = cfNpv$Y4,
    NPV = cfNpv$npvY0
  )
  
  return(salesCashFlowNpv)
}

npvCdf <- function(n) {
  N <- sort(n)
  P <- ecdf(N)
  return(P)
}

getBreakEven <- function(consts, n) {
  x <- filter(n, NPV >= 0)
  breakEvenNpv = min(x$NPV)
  breakEvenSales = round(select(filter(x, round(NPV, consts$DIGITS) == breakEvenNpv),
    Y1Sales)$Y1Sales, 0)
  
  return(breakEvenSales)
}

npdDistribution <- function(consts, m) {
  N <- m$NPV / MILLION
  P <- npvCdf(N)
  #
  # NPV distribution curve
  n <- sort(N)
  p <- P(n) * 100
  df <- data.frame(npv = n, odds = p)
  
  return(df)
}

pointsOnPlot <- function(consts, m) {
  # Points of interest
  s <- getBreakEven(consts, m)
  o <- consts$NPV_BREAK_EVEN_WORST_ODDS
  w <- o * 100
  n <- m$NPV / MILLION
  n <- sort(n)
  q <- round((quantile(n, o)), consts$DIGITS)
  e <- consts$NPV_BREAK_EVEN_VALUE
  b <- npvCdf(m$NPV / MILLION)(0) * 100
  
  p <- data.frame(s, o, w, q, e, b)
  names(p) <- c("s", "o", "w", "q", "e", "b")
  
  return(p)
}

makePlot <- function(d, p) {
  # Labels
  npvOdds <- paste("Odds of break-even : ", p$b, "%")
  salesThresh <- paste("Sales threshold : ", p$s)
  worstCase <-
    paste("Worst case (@ 5% odds) : ", p$q, "million")
  #
  # Make plot
  #
  g <- ggplot(d, aes(x = d$npv, y = d$odds)) +
    geom_line(colour = "blue") +
    labs(title = "Net Present Value (NPV) and Odds") +
    labs(x = "NPV (million)") +
    labs(y = "Percent (%)") +
    geom_vline(xintercept = p$e,
      colour = "red",
      linetype = "longdash") +
    geom_hline(yintercept = p$b,
      colour = "green",
      linetype = "longdash") +
    geom_vline(xintercept = p$q,
      colour = "green",
      linetype = "dotdash") +
    geom_hline(yintercept = p$w,
      colour = "red",
      linetype = "dotdash") +
    annotate("text", x=4, y=75, label=npvOdds)
  return (g)
}

buildOutputTable <- function(consts, p) {
  outputTable <-
    data.frame(consts$OUTLAY, consts$HURDLE_RATE * 100, p$b, p$s, p$q)
  names(outputTable) <-
    c("Outlay (mn)",
      "Hurdle Rate (%)",
      "BE Odds (%)",
      "BE Sales",
      "Wconsts Loss (mn)")
  
  return(outputTable)
}

npvCalc <- function(consts, threshold = -1) {
  s <- salesIter(consts, threshold)
  r <- revenueIter(consts, s)
  v <- varCostIter(consts, s)
  f <- fixedCostIter(consts)
  c <- cashFlow(consts, r, v, f)
  n <- cashFlowNpv(consts, c)
  t <- salesCashFlowNpv(consts, s, n)
  
  return (t)
}

npvPlot <- function(consts, threshold = -1) {
  set.seed(consts$SEED)
  n <- npvCalc(consts, threshold)
  d <- npdDistribution(consts, n)
  p <- pointsOnPlot(consts, n)
  g <- makePlot(d, p)
  
  return(g)
}

npvTable <- function(consts, threshold=-1) {
  set.seed(consts$SEED)
  n <- npvCalc(consts, threshold)
  d <- npdDistribution(consts, n)
  p <- pointsOnPlot(consts, n)
  t <- buildOutputTable(consts, p)
  
  return(t)
}

getConstants <- function(input) {
  consts <- data.frame(
    SEED,
    3,
    100,
    0.0,
    0.05,
    input$outlay,
    input$hurdleRate,
    input$salesRange[1],
    input$salesRange[2],
    input$salesMode,
    input$demDeclMean,
    input$demDeclSd,
    input$varCostMean,
    input$varCostSd,
    input$fixedCostRange[1],
    input$fixedCostRange[2]
  )
  names(consts) <-
    c("SEED",
      "DIGITS",
      "PRICE",
      "NPV_BREAK_EVEN_VALUE",
      "NPV_BREAK_EVEN_WORST_ODDS",
      "OUTLAY",
      "HURDLE_RATE",
      "SALES_TRIANG_MIN",
      "SALES_TRIANG_MAX",
      "SALES_TRIANG_MODE",
      "DEM_DECL_FACTOR_MEAN",
      "DEM_DECL_FACTOR_SD",
      "VAR_COST_RATE_MEAN",
      "VAR_COST_RATE_SD",
      "FIX_COST_RATE_MIN",
      "FIX_COST_RATE_MAX"
    )
  
  return(consts)
}

npvOddsPlot <- function(input) {
  consts <- getConstants(input)
  threshold <- -1
  g <- npvPlot(consts,threshold)
  return (g)
}

npvOddsTable <- function(input) {
  consts <- getConstants(input)
  df <- npvTable(consts)
  return(df)
}

ui <- fluidPage(
  titlePanel("New Product Introdction Demo"),
  
  sidebarLayout(
    sidebarPanel(
      sliderInput("outlay", 
        label = "Initial Outlay (million)",
        min = 0.1, max = 10, value = 0.5),
      sliderInput("hurdleRate", 
        label = "Hurdle Rate",
        min = 0.01, max = 1, value = 0.1),
      sliderInput("salesRange", 
        label = "Range of sales",
        min = 100, max = 50000, value = c(500, 24000)),
      sliderInput("salesMode", 
        label = "Mode of sales",
        min = 1000, max = 40000, value = 6000),
      sliderInput("demDeclMean", 
        label = "Demand declining factor - Mean",
        min = 0.0, max = 1.0, value = 0.9),
      sliderInput("demDeclSd", 
        label = "Demand declining factor - SD",
        min = 0.0, max = 1.0, value = 0.1),
      sliderInput("fixedCostRange", 
        label = "Fixed Costs - Min and Max",
        min = 30000, max = 50000, value = c(36000, 44000)),
      sliderInput("varCostMean", 
        label = "Variable Costs - Mean",
        min = 10, max = 50, value = 20),
      sliderInput("varCostSd", 
        label = "Variable Costs - SD",
        min = 1, max = 10, value = 3)
    ),
    mainPanel(
      plotOutput("npdPlot"),
      tableOutput("npdTable")
    )
  )
)

server <- function(input, output, session) {
  output$npdPlot <- renderPlot({
    print(reactiveValuesToList(input))
    npvOddsPlot(reactiveValuesToList(input))
  })
  output$npdTable <- renderTable({
    npvOddsTable(input)
  })
}

shinyApp(ui, server)

It produces this Shiny app:


If you source the functions above and then run:

input <- list(
    varCostSd = 3,
    salesMode = 6000,
    demDeclMean = 0.9,
    fixedCostRange = c(36000, 44000),
    salesRange = c(500, 24000),
    hurdleRate = 0.1,
    varCostMean = 20,
    demDeclSd = 0.1,
    outlay = 0.5
)

npvOddsPlot(input)

you get the same plot:

Reply all
Reply to author
Forward
0 new messages