How to fit a regression curve so that 90% of the data points lie below it?

I have the following dataset in R:

x <- c(0.1, 3, 4, 5, 9, 12, 13, 19, 22, 25)
y <- c(5, 12, 17, 23, 28, 39, 26, 31, 38, 40)
bd <- data.frame(x, y)

My question is: how can I fit a regression model in R such that 90% of the data lies below the fitted curve and the curve passes through the origin (zero)?

It seems that a power (geometric) model fits this case best. I tried fitting y = a*x^b as follows; the curve starts at the origin, but 90% of the data is not below it.

library(ggplot2)

ggplot(bd,aes(x = x, y = y)) + 
  geom_point() + 
  stat_smooth(method = 'nls', formula = 'y~a*x^b', 
              method.args = list(start = list(a = 1, b = 1)), 
              se = FALSE)


Author: Rui Barradas, 2020-02-23

1 answer

Here are two ways to solve the problem using the quantreg package.

The formula y = a*x^b can be linearized by taking logarithms, log(y) = log(a) + b*log(x), and the resulting linear model can then be fitted by quantile regression at the 0.90 quantile.
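
As a minimal sketch of that transformation, the coefficients of the log-log fit can be mapped back to a and b. It assumes the quantreg package is installed and uses the data from the question; the names fit_log, a_hat and b_hat are illustrative only. Because exp() is monotone, the back-transformed line is still a 0.90 quantile curve on the original scale, and with b > 0 the curve a*x^b passes through the origin.

```r
# Sketch: recover a and b of y = a * x^b from a log-log quantile fit.
# Assumes quantreg is installed; object names here are illustrative.
library(quantreg)

x <- c(0.1, 3, 4, 5, 9, 12, 13, 19, 22, 25)
y <- c(5, 12, 17, 23, 28, 39, 26, 31, 38, 40)
bd <- data.frame(x, y)

# log(y) = log(a) + b * log(x), fitted at the 0.90 quantile
fit_log <- rq(log(y) ~ log(x), tau = 0.90, data = bd)

a_hat <- exp(coef(fit_log)[[1]])  # the intercept estimates log(a)
b_hat <- coef(fit_log)[[2]]       # the slope estimates b

c(a = a_hat, b = b_hat)
```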

1. This can be done automatically with the stat_smooth function of the ggplot2 package.

library(ggplot2)

ggplot(bd, aes(x = log(x), y = log(y))) + 
  geom_point() + 
  stat_smooth(method = quantreg::rq, formula = 'y ~ x', 
              method.args = list(tau = 0.9), se = FALSE)


2. One can also fit the model explicitly and compute the fitted values.

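# Fit the 0.90 quantile regression on the log-log scale
fit90 <- quantreg::rq(log(y) ~ log(x), tau = 0.90, data = bd)
# Grid of x values covering the observed range
xnew <- seq(min(bd$x), max(bd$x), length.out = 50)
# Predict on the log scale and back-transform to the original scale
y90 <- exp(predict(fit90, newdata = data.frame(x = xnew)))
pred90 <- data.frame(x = xnew, y = y90)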

ggplot(bd, aes(x, y)) +
  geom_point() +
  geom_line(data = pred90, aes(x, y), colour = "blue")

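
To check that the fitted curve does what was asked, one can count how many observations fall on or below it. The following is a sketch under the same assumptions as above (quantreg installed, data from the question); y_fit and prop_below are illustrative names, and the small tolerance is there only so that points the fit passes through are not counted as above the curve because of rounding.

```r
# Sketch: share of observations on or below the fitted 0.90 quantile curve.
library(quantreg)

x <- c(0.1, 3, 4, 5, 9, 12, 13, 19, 22, 25)
y <- c(5, 12, 17, 23, 28, 39, 26, 31, 38, 40)
bd <- data.frame(x, y)

fit90 <- rq(log(y) ~ log(x), tau = 0.90, data = bd)

# Fitted curve at the observed x values, back on the original scale
y_fit <- exp(predict(fit90, newdata = bd))

# Proportion of points on or below the curve (relative tolerance for rounding)
prop_below <- mean(bd$y <= y_fit * (1 + 1e-8))
prop_below
```

With only ten observations the proportion can only change in steps of 0.1, so it should come out close to, rather than exactly at, the target for other values of tau.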

Author: Rui Barradas, 2020-03-02 23:09:09