Financing policies via stochastic control: a dynamic programming approach

This paper deals with a theoretical stochastic dynamic optimization model for the external financing of firms. We aim at searching for the best intensity of payment that a financier has to apply to a company in order to have a loan repaid. The techniques involved are related to the optimal control theory with exit time. We follow a dynamic programming approach. Our model also presents a distinction between the legal and the illegal financier, and a theoretical comparison analysis of the results is presented. Some numerical examples provide further validation of the theoretical results.


Introduction
In the breakthrough paper of Modigliani and Miller (1958), the authors used an arbitrage argument to prove the separability of corporate financing and investment decisions, when perfect capital market assumptions hold. Modigliani and Miller's result can be summarized as stating the irrelevance of capital structure in the evaluation of the firm value.
In actual fact, the life of a firm can be influenced by several events, the impact of which can drastically change the evolution of the dynamics associated to the firm value. We focus in this paper on a relevant issue of company wealth, that is the external financing.
Part of the literature on mathematical models for company external financing relies on decision theory. Indeed, the scientific research on this microeconomic subject often provides an answer to some simple questions, e.g.: Q1 What is the best loan interest rate that a financier should apply to a funded company? Q2 What is the best financing strategy that a company holder should follow to maximize the wealth of the company?
In both cases, an optimization model has to be constructed and developed. Moreover, due to the randomness and the evolving nature of the economic environment with which we work, a dynamic stochastic optimization model should be proposed. Brennan and Schwartz (1978) is the starting point of quantitative studies on the search for the optimal external financing strategy. They perform a numerical analysis to determine the optimal leverage when the wealth of the firm follows a diffusion process with constant volatility. Brennan and Schwartz's paper is undoubtedly relevant, although the lack of a purely theoretical perspective does not allow closed form solutions to Q2. In this respect, Leland (1994) shows closed form solutions for debt values and equity values assuming infinite life for the debts. In Leland and Toft (1996), the (very restrictive) assumption of infinite life debt is removed.
More recently, Sethi and Taksar (2002) focus on problem Q2. In fact, the authors consider the problem of searching for the best financing mix of retained earnings and external equity in a stochastic framework, in order to maximize the value of a company. For their purposes, they formulate and explicitly solve a singular stochastic control problem. Sethi and Taksar (2002) is the stochastic extension of the deterministic model stated in Krouse and Lee (1973) and improved in Sethi (1978). Caballero and Pindyck (1996) also provide an answer to Q2. They examine the sources of randomness in company investments and the effects of external financing on the incomes of an industrial system. Their approach is to use dynamic optimization tools with a dynamic programming perspective. The authors extend and complement Dixit (1989) and Leahy (1991): indeed, on one hand they adopt the viewpoint of these papers and focus on the entry or exit decisions; on the other hand, in contrast with the quoted papers, they emphasize the effects of different sources of uncertainty on company financing policies.
The problem proposed in Q1 belongs to the standard theory of corporate finance, and there is a large amount of literature dealing with the analysis of the best loan interest rate that a financier should apply. For a detailed description of this subject, we refer to the monographs (Brealey et al. 2006;Damodaran 2006;Tirole 2006).
Despite its relevance, problem Q1 has rarely been studied with quantitative optimization techniques. It is worth citing some relevant recent contributions. Stanhouse and Stock (2008) discuss the optimal rate that a bank should charge on a loan to maximize the expected profit in the presence of prepayment risk. The authors also analyze the relationship between such a loan interest rate and the maturity of the loan. Chang and Lin (2006) analyze the role of the interconnections among banks in determining the optimal loan interest rate. They adopt an option approach, and show the relationship between the optimal loan interest rate and the degree of capital market imperfections. Kahn et al. (2005) focus on bank behavior in the customer loan market, with a particular emphasis on the dynamics of loan interest rates. Cifarelli et al. (2002) propose a model for the choice of the best intensity of payment that a legal or illegal financier has to apply to a firm in order to have a loan repaid. The authors extend Masciandaro et al. (1997), and analyze the ruin probability of the firm via the theory of differential equations. The firm wealth evolves according to a stochastic differential equation, which has also been encountered in Li et al. (1996) in a different setting, where the stochastic intensity of the debt restitution appears in the drift coefficient as an additive term. As extensions of Cifarelli et al. (2002), see also Barone et al. (2011) for the case of illegal financiers and Cerqueti and Quaranta (2011) for legal financiers.
We contribute to this strand of literature by dealing with problem Q1 from a dynamic stochastic optimization perspective, following a dynamic programming approach. More precisely, we consider a company and a financier operating in a dynamic stochastic environment and construct a mathematical model for external financing, adopting the financier's point of view. Ultimately, we search for the best payment flow, in a sense made precise below, that a financier has to apply to a firm in order to have a debt repaid.
In our framework, the payment flows are to be understood as the annuities that the funded company pays to the financier to repay the debt. We notice that most of the literature in this field refers to loan interest rates. The preference we accord to the analysis of the payment flows is grounded on two basic principles. On the one hand, payment flows and loan interest rates are strongly interconnected, and it is possible to derive information on one of them starting from the other. Hence, our arguments continue to be meaningful when referring to loan interest rates instead of payment flows. On the other hand, as we shall see below, the introduction of the payment flows implies a rather complicated model, which can be treated only by using sophisticated mathematical tools. Therefore, the development of our model also leads to some interesting contributions from a purely theoretical perspective.
Furthermore, we maintain the distinction discussed in Cifarelli et al. (2002) between the legal and the illegal financier. In fact, a suitable model for the external financing should take the differences between the financiers into account. The targets of a bank and of an illegal financier are reasonably not the same: the bank aims at maximizing its profit and is not interested in the failure of the firm; the usurious financier uses illegal markets to get a profit from the failed firm. To this end, it takes up the position which will bring about the bankruptcy of the financed subject.
As we stated before, the choice of the best time-varying stochastic flow of payment is performed by solving a dynamic stochastic optimization problem. For a collection of optimization techniques applied to real life problems, refer to AitSahlia et al. (2008), Pardalos and Tsitsiringos (2002) and the monumental work of Christodoulos and Pardalos (2009).
In our setting, the problem is studied up to the time when the debt is completely repaid or the company fails. Since the date of success or failure of the firm is not fixed a priori, our optimization problem has a stochastic time horizon, endogenously determined by the dynamics of the firm wealth.
A brief discussion of stochastic control theory is now needed. The term mathematical control theory was introduced about half a century ago. Despite this fact, the nature of the optimal control problem has been the focus of research into optimization since the fifteenth century. The precursor of the techniques involved in optimal control is commonly seen in the calculus of variations. For a very interesting survey of the early optimization problems, we suggest Yong and Zhou (1999, Historical Remarks, p. 92). Bellman was one of the first to point out the need to introduce randomness into optimal control theory and to mention stochastic optimal control theory (Bellman 1958). Nevertheless, stochastic differential equations and Ito's Lemma were not involved in Bellman (1958), and the first paper dealing with diffusion systems, Markov processes and differential equations was Florentin (1961). Nowadays, the literature in this field is growing as it is applied to economics, biology, finance, engineering and so on.
The key point of optimal control theory is an optimization problem where the constraints are associated with the properties of some functions (the controls α), which are elements of a certain functional space (the admissible region A). Thus, the objective function J is a functional which depends on the controls. The optimum of such an objective functional with respect to the controls is called the value function V.
The stochastic framework is related to the analysis of cases with admissible region given by stochastic process spaces.
Starting from the objective functional and the definition of the admissible region, there are basically two methods of proceeding: the Stochastic Maximum Principle (strongly related to martingale theory) and Dynamic Programming (bringing in the theory of differential equations). In the former case, a set of necessary conditions for stochastic optimal controls is provided through forward-backward stochastic differential equations for adjoint variables and related stochastic Hamiltonian systems. In the latter case, one has to prove an optimality principle, named the Dynamic Programming Principle, and relate the value function to the (classical) solution (if it exists, if it is unique) of a differential equation, named the Hamilton-Jacobi-Bellman (HJB) equation.
In this paper, we adopt this second point of view. For our purposes, we use a Dynamic Programming Principle for stochastic control problems with exit time recently proved in Cerqueti (2009) via analytic techniques.
The HJB equation is stated formally, in the sense that we derive it from the Dynamic Programming Principle by assuming the appropriate regularity of the value function. Since the value function is generally not regular enough, a weak notion of solution is needed: the viscosity solution. For the concept of viscosity solution, we refer to the seminal works Crandall et al. (1984), Crandall and Lions (1981, 1983, 1987) and Lions (1981, 1983a, c, 1985). For a complete survey, we refer the reader to Barles (1994), Fleming and Soner (1993), Lions (1982) and the celebrated User's Guide of Crandall et al. (1992).
In this work, we prove that the value function V is a classical solution of such differential equation in two steps: in the first one, we prove that V is the unique viscosity solution of the Hamilton-Jacobi-Bellman Equation; the second step concerns the study of the regularity of V .
Then, we find the optimal strategies in feedback form via a Verification Theorem and we provide an economic interpretation of them. Moreover, we analyze the distinction between the legal and the illegal financier, with several comments and suggestions for further research. Lastly, we propose some numerical experiments, in order to show evidence of the usefulness of the Dynamic Programming approach as a technique. The results obtained are totally in agreement with the theoretical findings.
This work is organized as follows. The next section is devoted to the statement of the model. In the third section, the Hamilton-Jacobi-Bellman equation is derived and solved. The fourth section is devoted to the optimal strategies. In the fifth section, we provide the comparison between the legal and the illegal financier. The sixth section contains some numerical experiments. In the last section, we present our conclusions, with some future research lines. The proofs are relegated to the "Appendix".

The model
The aim of this section is to describe the economic environment of the problem. In particular, we define the state equation and the value function related to our optimization framework. The main assumptions in force throughout the paper are also discussed.
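We introduce a probability space with filtration $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \in \mathbb{R}^+}, P)$ on which we define a standard Brownian Motion $W$ with respect to $\{\mathcal{F}_t\}$ under $P$. Here the filtration $\{\mathcal{F}_t\}$ represents the $P$-augmentation of the natural filtration generated by $W$, that is
$$\mathcal{F}_t = \sigma\big(W(s) : 0 \le s \le t\big) \vee \mathcal{N}, \qquad t \in \mathbb{R}^+,$$
where $\mathcal{N}$ is the collection of all the sets of measure zero under $P$, i.e.:
$$\mathcal{N} = \{A \in \mathcal{F} : P(A) = 0\}.$$
Since the Brownian Motion is a continuous process, the filtration $\{\mathcal{F}_t\}_{t \in \mathbb{R}^+}$ is right continuous. Hence, the filtration satisfies the usual conditions.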
The state equation describes the stochastic evolution at time t of the dynamics X(t) associated with the wealth of the firm. It is given by the controlled stochastic differential equation with initial data (1), where:
• μ, σ ∈ R are related, respectively, to the deterministic and stochastic evolution of the firm wealth.
• α(·) is an F_t-progressively measurable stochastic process and it represents the intensity of payment paid by the funded firm to the financier.
• X_0 ∈ [0, K] is the initial wealth of the firm. Formally, it should be a random variable in [0, K] with law π_0, measurable with respect to F_0. Since it is reasonable that the initial situation of the funded company is known, we can assume that X_0 = x ∈ [0, K], with x nonrandom.
• the standard 1-dimensional Brownian Motion W(·) is independent of X_0. It drives the stochastic term of the firm wealth evolution.
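As a rough illustration of how a single path of the firm wealth could be generated numerically, the sketch below applies an Euler-Maruyama step and stops the path at the first exit from (0, K). The specific form dX = (μX − α) dt + σX dW is an assumption made here for illustration only (the text states that α enters the drift additively, but the exact equation is not reproduced), and the names `simulate_wealth`, `alpha`, `dt` and `n_steps` are hypothetical.

```python
import random


def simulate_wealth(x0, alpha, mu, sigma, K, dt=1.0, n_steps=1000, rng=None):
    """Euler-Maruyama path of the ASSUMED dynamics
        dX = (mu * X - alpha(X)) dt + sigma * X dW,
    absorbed at the barriers 0 (failure) and K (loan repaid).

    alpha is a feedback rule mapping the current wealth to the intensity
    of payment. Returns (path, exit_time, status); exit_time is None when
    no barrier is reached within n_steps.
    """
    rng = rng or random.Random()
    x = x0
    path = [x]
    for step in range(1, n_steps + 1):
        dw = rng.gauss(0.0, dt ** 0.5)  # Brownian increment over dt
        x = x + (mu * x - alpha(x)) * dt + sigma * x * dw
        if x <= 0.0:  # lower barrier: company failure
            path.append(0.0)
            return path, step * dt, "failure"
        if x >= K:  # upper barrier: debt completely repaid
            path.append(K)
            return path, step * dt, "repaid"
        path.append(x)
    return path, None, "running"
```

For instance, `simulate_wealth(100.0, lambda x: 5.0, mu=0.001, sigma=0.01, K=1000.0)` would return one simulated path under a constant intensity of 5 per time step; these parameter values are purely illustrative.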

Remark 1
The bound values 0 and K are absorbing barriers for the dynamics of the wealth of the firm, which evolves under the pressure of the payment of the debt. When the wealth of the firm reaches the value 0, the company fails; if the firm wealth reaches the value K, the loan is extinguished.
We are interested in analyzing the external financing problem up to its natural conclusion: the firm failure or the complete restitution of the debt. To this end, we need to introduce the random times at which the wealth of the company reaches one of the absorbing barriers 0 and K. We denote by $\mathcal{T}$ the set of the stopping times in $[0, +\infty]$ as follows:
$$\mathcal{T} := \big\{\eta : \Omega \to [0, +\infty] \;\big|\; \{\eta \le t\} \in \mathcal{F}_t, \ \forall\, t \ge 0\big\}.$$
The exit time τ of the dynamics from [0, K] is
$$\tau := \inf\{t \ge 0 : X(t) \notin (0, K)\}.$$
Since $\{\mathcal{F}_t\}$ satisfies the usual conditions, it results that $\tau \in \mathcal{T}$.
The intensity of payment that a financier has to apply to a company is defined as the product between the value of the debt and the loan interest rate. Hence, it must obviously be positive. Moreover, it should have an upper bound, as evidence suggests. In our opinion, the upper and lower thresholds could be numerically estimated by studying a suitable number of financing cases. The arguments above lead us to define the admissible region (the functional space containing the admissible controls) as follows:
$$\mathcal{A} := \big\{\alpha : [0, +\infty) \times \Omega \to [\delta_1, \delta_2] \;\big|\; \alpha \text{ is } \mathcal{F}_t\text{-progressively measurable}\big\}, \tag{4}$$
where the constants $\delta_1$ and $\delta_2$ are, respectively, the lower and upper bounds for the admissible intensity of payment.
The objective of our analysis is to search for the best intensity of payment that a financier has to apply to a company, according to a suitable optimality criterion. Here, we propose the maximization of the expected discounted intensity of payments related to the loan. The payments are made up to the moment of company failure or total restitution of the debt.
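For our purpose, we define the objective functional as
$$J(x, \alpha(\cdot)) := E_x\left[\int_0^{\tau} e^{-\delta t}\,\alpha(t)\,dt + e^{-\delta \tau}\, C(X(\tau))\right], \tag{5}$$
where $E_x$ is the expected value conditioned on $X_0 = x$, $e^{-\delta}$ is the one-period discount factor associated with the cost of capital for the financier, $\delta \in \mathbb{R}^+$, and $C$ is the terminal cost given by
$$C(x) := \begin{cases} A, & x = 0, \\ H, & x = K, \end{cases} \tag{6}$$
with $A, H \in \mathbb{R}^+$. The constants A and H describe the final amount obtained by the financier when the company goes bankrupt or the debt is totally repaid, respectively.
The maximization problem can now be formalized by the definition of the value function, that is
$$V(x) := \sup_{\alpha(\cdot) \in \mathcal{A}} J(x, \alpha(\cdot)), \qquad x \in [0, K]. \tag{7}$$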

Hamilton-Jacobi-Bellman equation and verification theorem
This section contains the formalization of the strategy we adopt to solve the stochastic optimal control problem constructed in the previous section: the dynamic programming approach.
Starting from some recent results on the dynamic programming principle, we write the dynamic programming equation and prove that the value function is its unique classical solution.
The dynamic programming principle for a class of stochastic optimal control problems more general than ours has been recently proved in Cerqueti (2009), by using analytical techniques grounded on measurable selection theory. Hence, we enunciate the dynamic programming principle for our particular setting (Theorem 2) and omit the proof; in its statement, the sup is taken over all admissible controls α.
The dynamic programming equation, named the Hamilton-Jacobi-Bellman (HJB) equation, is a direct consequence of Theorem 2, and it is solved by the value function V only under strong regularity conditions (see Lions 1983b, c, 1985). We formalize the dynamic programming equation in our framework as Equation (9), with the relaxed boundary conditions (10) and (11). The optimal strategies of the dynamic stochastic optimization problem we are studying are implied by the existence and uniqueness of the classical solution of the HJB Equation (9) with boundary conditions (10)-(11), as we shall see in the Verification Theorem. Unfortunately, the regularity of the value function is not easy to prove, and the same is true of the existence and uniqueness of a twice differentiable solution of the HJB equation. We are therefore obliged to weaken the definition of solution of the HJB equation, and prove that V is the unique viscosity solution of the HJB Equation. The Introduction also contains a list of key references for the concept of viscosity solutions of an HJB equation.
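To give a concrete feeling for the kind of equation involved, the following LaTeX sketch writes down the HJB equation one would formally obtain under two assumptions that are not taken from the text: a state equation of the form $dX = (\mu X - \alpha)\,dt + \sigma X\,dW$, and an objective given by the expected discounted payments plus the discounted terminal amount. The boundary values are written in their classical, not relaxed, form.

```latex
% ASSUMPTION (illustrative): dX(t) = (\mu X(t) - \alpha(t))\,dt + \sigma X(t)\,dW(t)
% ASSUMPTION (illustrative): J = E_x[\int_0^\tau e^{-\delta t}\alpha(t)\,dt
%                                   + e^{-\delta\tau} C(X(\tau))]
\begin{align*}
\delta V(x)
  &= \sup_{a \in [\delta_1, \delta_2]}
     \Big\{ (\mu x - a)\, V'(x) + a \Big\}
     + \tfrac{1}{2}\sigma^2 x^2 V''(x) \\
  &= \mu x\, V'(x) + \tfrac{1}{2}\sigma^2 x^2 V''(x)
     + \sup_{a \in [\delta_1, \delta_2]} a\,\big(1 - V'(x)\big),
     \qquad x \in (0, K),
\end{align*}
% classical boundary values (the relaxed conditions are weaker)
\[
V(0) = A, \qquad V(K) = H .
\]
```

Under these assumptions the control enters linearly, so the supremum is attained at one of the endpoints $\delta_1$, $\delta_2$ whenever $V'(x) \neq 1$.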
The following Existence and Uniqueness Theorem is a consequence of some results from Barles and coauthors (see Barles and Burdeau 1995; Barles and Rouy 1998).
Theorem 4 The value function V is continuous in (0, K) and can be extended continuously on [0, K]. Moreover, V is the unique viscosity solution of the HJB Equation (9) with variational boundary conditions (10) and (11).

Regularity of the value function
Theorem 4 implies that, if the value function is twice differentiable in (0, K), then it is a classical solution of the HJB equation. We have already mentioned that if V is the unique classical solution of the Hamilton-Jacobi-Bellman equation, then we can formally discuss the optimal strategies of the stochastic control problem. To this end, we first need a result on the concavity of the value function.
Theorem 5 The value function V is concave in [0, K].
For the proof, see the "Appendix".
A further result, on the strict concavity of the value function, is stated next. The strict concavity of the value function will be used in the analysis of the optimal strategies.
Lemma 6 Assume that δ ≠ μ. Then V is strictly concave.
See the "Appendix" for the proof.
We come now to the regularity theorem, which guarantees that the viscosity solution of the HJB equation is a classical solution. For the proof of the following result, see the "Appendix".

Optimal strategies
This section contains the explicit formulation of the optimal solution of our control problem.
The optimal strategies and trajectories of our stochastic control problem can be theoretically identified by proving a Verification Theorem. To reach this goal, we start from the HJB equation stated in Theorem 3. More precisely, we use the fact that the value function is the unique classical solution of the HJB equation, as follows from Theorems 4 and 7. The proof of the Verification Theorem is contained in the "Appendix".
Theorem 8 guarantees the existence of the optimal strategies related to our stochastic control problem, from a purely theoretical point of view. This result is grounded on the regularity of the value function, which is a classical solution of the HJB equation. The next step in our work is to provide an explicit form for the optimal strategies and trajectories. First of all, we lighten the notation. By Theorems 3, 4 and 7, we can write the Hamilton-Jacobi-Bellman equation in compact form, in terms of the operator H^a defined in (13). The next result explicitly identifies the value maximizing the operator H^a defined in (13).
Proposition 9 For fixed x ∈ [0, K], the absolute maximum point a* of the function H^a defined in (13) is given by (14).
See the "Appendix" for the proof.
Proposition 9 should be related to the optimal intensity of payment that a financier has to apply to a funded company. Moreover, it should also drive the optimal trajectory of the firm wealth. The connections between (14) and the pair (optimal control, optimal trajectory) can be observed by introducing the closed loop equation (15). The significance of the closed loop equation and of the optimal strategies is shown in the next result.
Theorem 10 Let us consider a* as in (14) and X̄ the solution of the closed loop equation (15). Then, setting ā(t) := a*(X̄(t)), we have J(x, ā(·)) = V(x) and the pair (ā, X̄) is optimal for the control problem.
Theorem 10 explicitly determines the optimal strategies for our stochastic control problem. We devote the next subsection to some further comments on our optimality results.

Some remarks on the optimal strategies
In the discussion about the regularity properties of the value function, we have shown that the value function V is twice differentiable in (0, K) and is concave. Furthermore, under the hypotheses of Lemma 6, we get that V is strictly concave. In this case, the inverse of the function V' exists, V' being strictly decreasing and continuous.
Assume μ ≠ δ and denote (V')^{-1} =: I. The optimal controls α* are of bang-bang type, as shown by (16), which follows from the analysis of the HJB equation. By the regularity properties of the value function, being δ ≠ μ and V not a constant function, there exists a unique point x_0 ∈ (0, K) such that x_0 = I(1).
We decompose the interval (0, K) into three sets: the points smaller than x_0, the point x_0 itself, and the points larger than x_0. If the initial wealth of the company is small enough (i.e., it belongs to the first set), then the best intensity of payment that a financier has to apply to a funded company coincides with the larger admissible one; if the initial wealth is larger than x_0 (i.e., it belongs to the third set), the smaller intensity of payment is the best choice. When the initial wealth of the firm coincides with the critical point x_0, the financier is indifferent between a large and a small payment flow.
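The bang-bang structure comes from maximizing a function that is linear in the control over the interval [δ_1, δ_2]: the maximum always sits at an endpoint, chosen by the sign of the coefficient. The sketch below illustrates this selection rule. It assumes that the control enters the Hamiltonian through a term of the form a(1 − V'(x)), which is consistent with the switching point x_0 = I(1) but is an assumption, not a formula taken from the text; the function names are hypothetical.

```python
def bang_bang_control(coefficient, delta_1, delta_2):
    """Maximizer of a -> coefficient * a over [delta_1, delta_2].

    Returns delta_2 for a positive coefficient, delta_1 for a negative
    one, and None when the coefficient vanishes (every admissible value
    is then optimal).
    """
    if coefficient > 0:
        return delta_2
    if coefficient < 0:
        return delta_1
    return None


def feedback_intensity(v_prime, delta_1, delta_2):
    """Feedback rule at a wealth level where V'(x) = v_prime.

    ASSUMPTION: the control appears in the Hamiltonian as a * (1 - V'(x)).
    """
    return bang_bang_control(1.0 - v_prime, delta_1, delta_2)
```

The `None` branch corresponds to the critical point, where the marginal value of wealth equals one and the choice of the intensity does not affect the Hamiltonian.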

Comparison between the legal and the illegal financier cases
The scope of this section is to propose and compare two different financing policy models, related to a legal and an illegal financier.
The following important monotonicity result holds.
Proposition 11 Assume that the conditions in (17) hold. Then V is an increasing function.
For the proof, see the "Appendix". The differences between legal and illegal financiers are basically related to the different wealth of a failed company, from a bank's and a usurer's point of view. While the legal financier does not have a positive income from the bankruptcy of a funded firm, the illegal financier is able, in this case, to take a position in illegal markets and obtain a positive amount. Generally, such an amount is larger than the income from the restitution of the debt. Therefore, we have:
• Legal financier case: A = 0, H > 0. We will refer to the objective functional and the value function, respectively, as J_leg and V_leg.
• Illegal financier case: 0 < H < A. We will refer to the objective functional and the value function, respectively, as J_ill and V_ill.
Remark 12 Proposition 11 allows us to understand something about the value functions in both cases. In the case of an illegal financier, we get immediately that the value function V ill cannot be increasing, because H < A is in disagreement with one of the conditions in (17).
In the case of a legal financier, we have to state a condition on the upper and lower thresholds for the admissible intensity of payment in order to get that V leg is increasing.
The illegal financier is expected to obtain more money from the funded firm than the legal financier. This empirical fact can be formalized in our model, as a consequence of Existence and Uniqueness Theorem 4:

Proposition 13 It results
An interesting result for the first derivatives of the value functions in the two cases holds, in agreement with what we pointed out in Remark 12.

Proposition 14 It results
For the proof see the "Appendix".

Numerical experiments
The aim of this section is to propose some numerical experiments to obtain a further validation of our theoretical study. More specifically, we will derive the intensity of payment α which maximizes the objective function J(x, α(·)) defined in (5) for three different starting points for the firm value. We proceed by performing a Monte Carlo simulation. For this purpose, some parameter values are assigned in accordance with the empirical literature:
• the initial intensity of payment α is given by the product of the loan interest rate i and the debt amount D. To fit the available data better, we will deal with the analysis of the optimal loan interest rate i, which will lead to the optimal intensity of payment α;
• we consider α varying within a band. The variation range of the payment flow derives from the range [i_1, i_2] of the loan interest rate i, and it should take into account the applicability of our model to the cases of legal and illegal financiers. The upper bound comes from the studies of the usury phenomenon. Indeed, although in principle the usury loan interest rate has no finite upper bound, empirical evidence shows that i_2 = 500%, with only 9% of the cases exceeding such a high threshold. So, the upper bound is fixed to δ_2 = 5 · D, while the lower bound is assumed to be δ_1 = i_1 · D = 0 · D, in agreement with the purposes of the illegal financier (to construct a trap for the company, to reinvest illegal money);
• we consider μ = 1 + ρ = 1.001, where ρ is the revaluation rate of the company, and σ = 0.01;
• our analysis is performed for three different starting points of the state variable: x = 100, x = 500 and x = 900 respectively, to represent firms of small, medium and large size;
• the initial amount of the loan D is prudentially assumed to be 20% of the value x. Hence we have D = 20, D = 100 and D = 180 for small, medium and large companies, respectively;
• the threshold K is considered equal to 1000;
• the parameters A, H, δ are assumed to be 200, 1200, 0.03, respectively.
The simulation procedure is implemented as follows:
• the stochastic differential of the Brownian Motion is discretized, as usual:
$$dW(t) \approx \varepsilon \sqrt{\Delta t},$$
where ε is a random number drawn from a centered normal distribution while Δt = 1 (1 day);
• a discretization of the range [0, 5] of the loan interest rate i with a step equal to 0.01 is applied, and each step is denoted as i_s, with s = 1, . . . , 50,000;
• the time-points are identified as days and we consider 1000 points to construct each trajectory in order to analyze the evolution of the firm wealth for approximately three years;
• having fixed i_s, we build 1000 trajectories X_j^{(i_s)}, j = 1, . . . , 1000. Let n(i_s) be the number of exit times τ at which the firm fails; we calculate for each value of i_s the bankruptcy probability as the relative frequency n(i_s)/1000.
Fix s = 1, . . . , 1000. According to definitions (5) and (6), and replacing the expected value operator with the arithmetic mean, we can write the approximation (20) of the functional J. The value function in (7) can be found by optimizing the functional J. There exists s* ∈ {1, . . . , 1000} such that i_{s*} maximizes the approximated functional in (20). By performing the simulations, we obtain that i_{s*} = 5. Hence, α_{s*} = 5 · D in the three cases of x = 100, 500, 900. This is in complete agreement with the bang-bang optimal control in (16), which has been found theoretically.
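The procedure above can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the experiments: it assumes the dynamics dX = (μX − α) dt + σX dW and an objective equal to the discounted payments plus the discounted terminal amount (A at failure, H at repayment), it restricts attention to constant intensities, and the function names and any parameter values passed to it are hypothetical.

```python
import math
import random


def estimate_objective(x0, alpha, mu, sigma, K, A, H, delta,
                       n_paths=1000, n_steps=1000, dt=1.0, seed=0):
    """Monte Carlo estimate of the objective and of the bankruptcy
    frequency for a CONSTANT intensity of payment alpha.

    ASSUMED dynamics: dX = (mu * X - alpha) dt + sigma * X dW,
    absorbed at 0 (failure, terminal amount A) and K (repaid, amount H).
    """
    rng = random.Random(seed)
    total, failures = 0.0, 0
    for _ in range(n_paths):
        x, payoff = x0, 0.0
        for k in range(n_steps):
            # discounted payment collected over the step [k*dt, (k+1)*dt)
            payoff += math.exp(-delta * k * dt) * alpha * dt
            dw = rng.gauss(0.0, math.sqrt(dt))
            x += (mu * x - alpha) * dt + sigma * x * dw
            if x <= 0.0 or x >= K:
                terminal = A if x <= 0.0 else H
                payoff += math.exp(-delta * (k + 1) * dt) * terminal
                if x <= 0.0:
                    failures += 1
                break
        total += payoff
    # arithmetic mean replaces the expectation
    return total / n_paths, failures / n_paths


def best_constant_intensity(x0, grid, **params):
    """Grid point with the largest estimated objective."""
    return max(grid, key=lambda a: estimate_objective(x0, a, **params)[0])
```

A call such as `best_constant_intensity(100.0, [0.0, 25.0, 50.0, 100.0], mu=0.001, sigma=0.01, K=1000.0, A=200.0, H=1200.0, delta=0.03, n_paths=200)` scans a coarse grid of intensities; a finer grid and more paths increase the cost proportionally.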

Conclusions and further research
In this paper, a model for optimal financing policies via dynamic optimization is proposed. After the construction of a very general theoretical model, stochastic control theory is used in order to derive the main properties of the model and solve the related optimization problem. The adopted approach is dynamic programming theory, and we use a Dynamic Programming Principle proved in Cerqueti (2009). The concept of viscosity solution is introduced to study the Hamilton-Jacobi-Bellman equation. As a further step, the distinction between the legal and the illegal financier is pointed out. The optimal intensity of payment to be applied by the financier depends on the initial wealth of the funded company: if the value of the firm is small (large) enough at the beginning of its life, then the optimal intensity of payment is smaller (larger) as well. There is a critical point for the initial wealth of the firm implying an arbitrary, undefined optimal payment flow. It represents the key point for the distinction between a company with a large or small initial wealth, in the financier's opinion. Some numerical experiments provide further validation of our theoretical findings. Several open problems arise from the theoretical model proposed in this work. We point out some of them, leaving the related analyses to future research.
• Analysis of the capital structure of the company, assumed to be dependent on a large number of parameters. From this point of view, it is not meaningless to consider μ and σ [see the state Eq. (1)] depending on the dynamics X(·).
• Influence of the financier in the evolution of the firm wealth (only for illegal financing). The model can be improved by inserting an additive quantity describing the possible influence of the investor on the dynamics of the firm, in the deterministic term of the state equation.
• Endogenous α. In this context, it is possible to construct an evolution equation for the payment flow α, which depends on the state variable. The control variables should be chosen directly from the constitutive parameters of the dynamics of α.
• Empirical analysis. To validate our theoretical model, an empirical analysis based on data from both the legal and the illegal markets could be carried out. Unfortunately, a suitable dataset is not easily available.
Footnote 2 (numerical experiments): Starting from i_1 = 0, when the 1000 trajectories X_j^{(i_s)}, each composed of 1000 points, are traced, the value i_s increases by 0.01; then, in relation to this new value of the loan interest rate, we determine another 1000 trajectories of 1000 points, and so on, up to i_{50,000}.

Proof of Theorem 5
We introduce the following equation: We obtain this equation starting from (9) as follows: given we obtain (22) as We need to recall some important results, which are useful in order to prove the concavity of the value function.
The proof is omitted. The previous result implies the following corollary.

Corollary 16
If u is the unique viscosity solution of (9), then it results that v := −u is the unique viscosity solution of (22).

Now we recall an important general result.
Lemma 17 (Alvarez et al. 1997) Let us consider the interval I ⊆ R. Assume that the operator H̄ satisfies the following properties: is concave, for every p.

Let us define the convex envelope
Then v** ∈ LSC(Ī) is a viscosity supersolution of (24).
Let us now prove the concavity of the value function.
Proof of Theorem 5 By the Existence and Uniqueness Theorem, the value function is continuous in (0, K) and can be extended continuously on [0, K]. Due to this fact, we need to prove the concavity of the value function in (0, K). Let us fix 0 < ε < K/2. We define the interval I_ε := (ε, K − ε). To prove the concavity in (0, K), we need to prove the concavity in I_ε, for each ε. The proof is divided into four steps.
• First step
By Corollary 16, in order to prove the concavity of V it is sufficient to prove that u := −V is a convex function. Indeed, we have to work on (22). Let us now define H(x, v, p, q), x ∈ (0, K). (25)

• Second step
The convex envelope u** is a viscosity supersolution of (25). In order to prove the claim, we have to check the hypotheses of Lemma 17. We get that H and a simple computation gives us that the map is concave for every p. Furthermore, a direct computation gives H(x, v, p, q_1) ≥ H(x, v, p, q_2) provided q_1 ≤ q_2 in I_ε. So H is an elliptic operator. The hypotheses of Lemma 17 are therefore satisfied, and the claim is proved.

• Third step
The convex envelope u** is a viscosity subsolution of (25).
Let us now observe that, if w_1 is a viscosity subsolution and w_2 is a viscosity supersolution of (25), then w_1 ≤ w_2, by the Existence and Uniqueness Theorem 4. Thanks to this result, we need to prove that u** ≤ u in order to prove that u** is a viscosity subsolution of (25). By the definition of convex envelope, we easily get u**(x) ≤ u(x) for each x ∈ I_ε, with the choice λ_1 = 1, λ_2 = 0, x_1 = x, x_2 arbitrary in I_ε.

• Fourth step
By Theorem 4 and by Corollary 16, we get that u** is the unique viscosity solution of (25), and so the viscosity solution of (22). Hence u** = u, and so u = −V is convex in (0, K).
By Theorem 4, V can be extended continuously to [0, K]. Thus, the concavity extends to [0, K].
The theorem is completely proved.
Proof of Lemma 6. Theorem 5 guarantees that V is concave. Therefore, it is sufficient to prove that, for δ ≠ μ, there do not exist α_1, α_2 ∈ R, α_1 ≠ 0, such that is a solution of (9) in I ⊆ [0, K], for each interval I. Suppose that δ ≠ μ and the value function V is as in (26). By substituting (26) in (9), we obtain that the following system must be satisfied: Since δ ≠ μ, the system (27) does not admit a solution. Hence, we can conclude that the condition δ ≠ μ implies that V cannot be written as in (26).

Proof of Theorem 7
Let us fix ε ∈ R+. In order to prove the claim, it suffices to check that V is twice differentiable in the compact set I_ε : Eq. 9 is uniformly elliptic in I_ε. Moreover, by the concavity and the continuity, thanks to Alexandrov's Theorem (see Fleming and Soner 1993, Appendix E, and just observe that in this particular case we have n = 1), we know that V is twice differentiable a.e. in I_ε. Moreover, we also get that V ∈ L^∞(I_ε).
We have V ∈ L^∞(I_ε) ⇒ V ∈ L^p(I_ε), p ∈ [1, +∞]. Thus, since I_ε is bounded, for each p ∈ [1, +∞), we have Then we can write, a.e. in I_ε, The right-hand side of (28) is the sum of functions which are in L^p in the compact set I_ε, and so we can state that V′′ ∈ L^p(I_ε), ∀ p ∈ [1, +∞].
Hence, we get that V is a function in the Sobolev space W^{2,p}(I_ε), for each p ∈ [1, +∞].
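The passage from L^∞ to L^p on a bounded interval used in this proof rests on an elementary estimate, recalled here as a short worked inequality. The symbols f (a generic essentially bounded function on I) and |I| (the length of I) are notation introduced only for this remark.

```latex
\[
\|f\|_{L^p(I)} = \Bigl(\int_I |f(x)|^p \, dx\Bigr)^{1/p}
\le |I|^{1/p} \, \|f\|_{L^\infty(I)} ,
\qquad 1 \le p < +\infty .
\]
```

Since |I| is finite, every function in L^∞(I) therefore belongs to L^p(I) for each p ∈ [1, +∞].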

Proof of Theorem 8
First of all, we show a technical lemma.
Then we have where (t) is defined as where (α*, X*) is an optimal couple associated to our control problem.
Proof
• By definition of stochastic integral, this condition holds if and only if the stochastic process Y(s) := e^{−δs} u′(X(s)) σ X(s) is square integrable with respect to s, i.e.
Hence, in order to prove the validity of (29), we have to prove the validity of (31). Since the viscosity solution of the HJB equation is twice differentiable, the first derivative of u is bounded. Therefore, there exists a positive scalar M_u such that |u′| ≤ M_u. Moreover, the dynamics (i.e. the solution of the state equation) is a process in [0, K].
So we get, for each ω ∈ Ω, So (31) is true, and thus (29) holds.
• Since the solution of (9) is bounded, by the monotonicity of the expected value operator we get: Taking the limit for t → +∞ in the three terms of (32), we obtain and (30) holds.
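The two estimates behind this proof can be sketched as follows. This is a plausible reconstruction under stated assumptions, not necessarily the authors' exact displays: σ is taken constant as in the state Eq. 1, δ > 0, the derivative of u is bounded by M_u on [0, K], and the state process takes values in [0, K].

```latex
\begin{align*}
\mathbb{E}\int_0^t Y(s)^2 \, ds
  &= \mathbb{E}\int_0^t e^{-2\delta s} \, u'(X(s))^2 \, \sigma^2 X(s)^2 \, ds \\
  &\le M_u^2 \, \sigma^2 K^2 \int_0^t e^{-2\delta s} \, ds
   = M_u^2 \, \sigma^2 K^2 \, \frac{1 - e^{-2\delta t}}{2\delta} < +\infty , \\[4pt]
\bigl| \mathbb{E}\bigl[ e^{-\delta t} u(X(t)) \bigr] \bigr|
  &\le e^{-\delta t} \sup_{[0,K]} |u| \xrightarrow[t \to +\infty]{} 0 .
\end{align*}
```

The first bound gives the square integrability of Y, so that the stochastic integral has zero expectation; the second shows that the discounted terminal term vanishes in the limit.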
Proof of Theorem 8. We give the proof by separating the cases.
By taking expectation, we get As in (a), we pass to the limit for t → +∞ under the expected value operator, by virtue of Lebesgue's Dominated Convergence Theorem and Fatou's Lemma. Hence, by (29) and (30), we obtain J(x, α*) = u(x).
and so we get that V(x) = u(x).

Proof of Proposition 9
The function H a is continuous with respect to the variable a ∈ [δ 1 , δ 2 ], and so Weierstrass' Theorem guarantees that it has an absolute maximum point in [δ 1 , δ 2 ].
A straightforward computation gives that the maximum point of H a is given by (14).

Proof of Theorem 10
First of all, we need a technical lemma to proceed.

Lemma 19
The closed loop Eq. 15 admits a unique solution.
Proof The proof of the result follows from (14) and from the existence and uniqueness of the solution of (1).

Proof of Theorem 10
The proof is a straightforward application of Theorem 8, starting from Proposition 9 and Lemma 19.
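To make the notion of a closed loop equation concrete, the following minimal sketch simulates one path of a controlled state process under a feedback rule and stops it at the exit time from (0, K). Everything specific here is an assumption made for illustration: the drift form dX = (μX − a(X)) dt + σX dW, the discounted payment functional, the clipped linear feedback (which is not formula (14)), the function names and all parameter values.

```python
import math
import random


def simulate_closed_loop(x0, K, mu, sigma, delta, feedback,
                         dt=1e-3, t_max=50.0, rng=None):
    """One Euler-Maruyama path of dX = (mu*X - a(X)) dt + sigma*X dW.

    The path is stopped at the first exit from (0, K), or at t_max.
    Returns (discounted payments collected before stopping, stopping time).
    The drift form and the payoff are assumptions of this sketch.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so that the run is reproducible
    x, t, payoff = x0, 0.0, 0.0
    while 0.0 < x < K and t < t_max:
        a = feedback(x)  # payment intensity chosen in feedback form
        payoff += math.exp(-delta * t) * a * dt
        noise = rng.gauss(0.0, 1.0) * math.sqrt(dt)
        x += (mu * x - a) * dt + sigma * x * noise
        t += dt
    return payoff, t


def clipped_feedback(d1, d2, slope):
    """Hypothetical feedback: linear in the state, clipped to [d1, d2]."""
    return lambda x: min(d2, max(d1, slope * x))


payoff, tau = simulate_closed_loop(
    x0=1.0, K=2.0, mu=0.05, sigma=0.2, delta=0.1,
    feedback=clipped_feedback(0.1, 0.5, 0.3),
)
print(payoff, tau)
```

Averaging the returned payoff over many independent paths would give a Monte Carlo estimate of the functional for this particular feedback; the truncation at `t_max` is a numerical convenience and not part of the model.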

Proof of Proposition 11
Consider x, y ∈ [0, K] such that x < y and a control α_x ∈ A that is ε-suboptimal for x, i.e.
where τ_x (respectively τ_y) is the exit time associated to the starting point x (respectively y) and the control α_x.
For ω ∈ Ω_1 we can write for H δ > δ_2. Assume now that ω ∈ Ω_2. Then we have for H δ > δ_2. By the monotonicity of the expected value operator, V is an increasing function.

Proof of Proposition 14
By the Regularity Theorem 7, it is sufficient to prove that Let us consider the partition of Ω defined in (37). Then, by the same arguments developed in the proof of Proposition 11, we get (38).