Controlling a launch vehicle at exoatmospheric flight conditions via adaptive control allocation

The focus of this paper is the control of a reusable launch vehicle at exoatmospheric flight conditions in the presence of actuator effectiveness uncertainty. Since dynamic pressure is nonexistent during exoatmospheric flight, aerodynamic control surfaces cannot be used. Under these conditions, reaction control jet actuators can provide the necessary thrust to control the vehicle. Reaction control jets have only 2 states, namely on and off, and continuous control inputs can be implemented with the help of pulse width modulation, which is also employed in this paper. A continuous controller is designed in the outer loop, and a control allocator is used to distribute the total control input among redundant actuators, whose effectiveness is assumed to be unknown. The unknown actuator effectiveness is addressed with the help of an adaptive control allocator. A representative model of a reusable launch vehicle equipped with reaction control jets is used to demonstrate the effectiveness of the overall control scheme.


Introduction
At exoatmospheric conditions, conventional aerodynamic control surfaces cannot be used, since forces and moments cannot be generated in the absence of dynamic pressure. Reaction control jets (RCJs) and reaction wheels are generally the main actuators under exoatmospheric conditions [1]. In this paper, the focus is on the attitude control of a launch vehicle equipped with only RCJs. RCJs provide thrust in only 1 direction and have 2 states, namely on and off, which puts them in the category of pulsed actuators.
A control system can treat RCJs as continuous actuators with the help of pulse modulation, or they can be used with a bang-bang control approach, where the jets are fired on and off based on a phase-plane analysis [2]. Implementations of both approaches can be found in the open literature. Phase-plane based on-off control using RCJs has been employed for the Apollo command module [3], the Mars Science Laboratory [4], the Space Shuttle [5], and the Near Earth Asteroid Scout CubeSat [6], to name a few. To address the multiaxis coupling problem, which is challenging to handle using phase-plane analysis, an alternative mixed-integer linear programming formulation is proposed in [7], where RCJs are blended together with aerodynamic control surfaces. Examples of pulse modulation based approaches utilizing RCJs can be found in [8][9][10][11].
Control allocation (CA) is a method used to distribute the desired total control effort produced by a controller among redundant actuators. There exist several different approaches to achieve this task. One way to allocate redundant actuators is to use the pseudo-inverse of the input matrix to produce individual actuator signals [12][13][14]. Another method is to define a cost function as the difference between the desired and achieved control signals and to use optimization techniques to minimize this function [15][16][17]. In general, a secondary objective, such as radar signature or drag minimization, is also achieved in these optimization-based methods by adding terms to the cost function. A survey on various types of CA can be found in [18].
* Correspondence: yyildiz@bilkent.edu This work is licensed under a Creative Commons Attribution 4.0 International License.
In this study, a control allocation based control framework is proposed for launch vehicles equipped with RCJs controlled using pulse width modulation (PWM). What distinguishes this work from existing studies is that the RCJ control effectiveness is assumed to be unknown. The control effectiveness may decrease due to wear and tear, as well as the thruster gas pressure loss after long periods of use. To address this problem, a control structure inspired by a recently developed adaptive control allocator [19,21] is used.
To summarize, the contributions of this study are twofold. First, a control framework for RCJ-equipped launch vehicles is proposed, where the RCJ dynamics are uncertain. Second, unlike in [19,21], the effectiveness of the adaptive control allocation algorithm is demonstrated in a setting where the actuators are controlled via PWM. In the simulation studies, it is shown that even when the actuators experience a dramatic loss of effectiveness, the proposed control framework is capable of providing a reasonable closed-loop performance.
The organization of the paper is as follows. In Section 2, the necessary background to follow the technical developments in the paper is provided. In Section 3, the dynamics of the vehicle to be controlled are given. The overall control framework is presented in Section 4, where the controller, the control allocator, and pulse width modulation are discussed. Simulation results are given in Section 5, where a comparison with a conventional optimal control allocator is provided. Finally, a summary and discussion of the paper are given in Section 6.

Preliminaries
In this section, the projection operator is introduced, following the description given in [20], which is used later in the technical development of this study.
Here, Θ_i and Y_i, i = 1, ..., m, refer to the i-th columns of the matrices Θ and Y, and the projection applied to these columns is defined in (1).
The projection operator can also be implemented elementwise. For a ∈ ℜ and b ∈ ℜ, which may be considered, for example, as components of the vectors Θ_i and Y_i, the projection operator is defined in (2). With ϵ ∈ ℜ⁺ as the projection tolerance, and a_min and a_max as the lower and upper bounds of a, the convex function f(·) in (2) is defined in (3). The following 2 properties [20] of the projection algorithm make it a useful tool for obtaining a stable closed-loop system in adaptive control applications.

Property 1 Given that a(0) ∈
It is noted that this property is used to guarantee the boundedness of the adaptation parameters independently from the stability of the overall system dynamics.
, where a ∈ ℜ and b ∈ ℜ are the components of the columns holds, where the trace operation is referred to as tr( · ).
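To make the elementwise projection operator more concrete, the following Python sketch implements one commonly used scalar form. It is an illustration under stated assumptions rather than a reproduction of (2) and (3): the quadratic boundary function `f`, its derivative `f_prime`, the function names, and all numerical values are assumptions chosen so that f ≤ 0 holds on [a_min + ϵ, a_max − ϵ] and f = 1 holds at a_min and a_max. The short integration at the end illustrates the boundedness behavior described by Property 1.

```python
def f(a, a_min, a_max, eps):
    """Assumed convex boundary function: 0 at a_min+eps and a_max-eps, 1 at a_min and a_max."""
    return ((a - a_min - eps) * (a - a_max + eps)) / ((a_max - a_min - eps) * eps)


def f_prime(a, a_min, a_max, eps):
    """Derivative of the assumed boundary function with respect to a."""
    return (2.0 * a - a_min - a_max) / ((a_max - a_min - eps) * eps)


def proj(a, b, a_min, a_max, eps):
    """Scalar projection: scale b down only when a is in the boundary layer and b points outward."""
    fa = f(a, a_min, a_max, eps)
    if fa > 0.0 and b * f_prime(a, a_min, a_max, eps) > 0.0:
        return b * (1.0 - fa)
    return b


def integrate(a0, b, steps, dt, a_min=-1.0, a_max=1.0, eps=0.1):
    """Forward-Euler integration of da/dt = proj(a, b) with a constant raw update b."""
    a = a0
    for _ in range(steps):
        a += dt * proj(a, b, a_min, a_max, eps)
    return a


# A constant positive update would drive a to infinity without projection;
# with projection, a approaches the upper bound a_max = 1 and stays there.
a_final = integrate(a0=0.0, b=1.0, steps=5000, dt=0.01)
```

In this form, the raw update b passes through unchanged while a is inside the inner interval, and it is attenuated smoothly to zero as a approaches either bound, which is why the adapted parameter remains bounded regardless of the signal driving it.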

Plant dynamics
In this study, a representative mathematical model for a reusable launch vehicle investigated in [7] is used to demonstrate the effectiveness of the proposed control framework. A brief description of the model is given in this section.

Consider the rotational equation of motion (5), where M ∈ ℜ^3 is the net moment acting on the vehicle, J ∈ ℜ^{3×3} is the inertia matrix, and ω ∈ ℜ^3 is the angular velocity vector, consisting of the roll, p, pitch, q, and yaw, r, rates. Assuming small angles, it is obtained that ϕ̇ = p, θ̇ = q, and ψ̇ = r, where ϕ, θ, and ψ are the Euler angles. With the state vector formed from these rates and angles, (5) can be represented in the state-space form (6), whose terms are given in (7). In (7), I_xx ∈ ℜ⁺, I_yy ∈ ℜ⁺, and I_zz ∈ ℜ⁺ are the vehicle moments of inertia about the main axes.
The net moment, M, is created with the help of RCJs. Assuming that there exist n RCJs, and given a mapping matrix T ∈ ℜ^{3×n}, the net moment can be calculated as
M = T u, (8)
where u ∈ ℜ^n represents the RCJs' output vector, each element of which can be treated as a real number between 0 and 1 with the help of PWM. Since the thrusters' effectiveness is assumed to be unknown, a diagonal matrix with positive elements, Λ ∈ ℜ^{n×n}, is introduced to (8) as
M = T Λ u. (9)
It is noted that Λ is unknown and represents uncertain actuator effectiveness.
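As a small numerical illustration of how the effectiveness matrix scales the moment produced by the jets, the Python sketch below evaluates the product of the mapping matrix, the diagonal effectiveness matrix, and the jet output vector. The 3-by-4 mapping matrix, the jet outputs, and the function name `net_moment` are hypothetical values chosen for illustration only; they are not the vehicle data from [7].

```python
def net_moment(T, lam, u):
    """Return M = T * diag(lam) * u for a 3 x n mapping matrix T."""
    n = len(u)
    return [sum(T[i][j] * lam[j] * u[j] for j in range(n)) for i in range(len(T))]


# Hypothetical mapping matrix: each column is the moment produced by one jet.
T = [
    [1.0, -1.0, 0.0, 0.0],    # roll moment contributions
    [0.0, 0.0, 1.0, -1.0],    # pitch moment contributions
    [0.5, 0.5, -0.5, -0.5],   # yaw moment contributions
]
u = [1.0, 0.0, 0.5, 0.0]      # duty-cycle commands, each between 0 and 1

M_nominal = net_moment(T, [1.0] * 4, u)    # full effectiveness (identity matrix)
M_degraded = net_moment(T, [0.3] * 4, u)   # every jet at 30 % effectiveness
```

With a uniform effectiveness of 0.3, every component of the moment shrinks by the same factor, so a controller that is unaware of the loss receives only 30 % of the moment it requested.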

Control system design
The overall closed-loop system structure, including the launch vehicle, the controller, and the control allocator, together with the pulse-width modulation (PWM) and the torque matrix, T, is shown in Figure 1. Below, the working principles of these individual components are explained.

Controller
The outer loop controller assumes plant dynamics without any uncertainties, given in (6). The uncertainties are handled by the adaptive control allocator, which is explained in the following sections. For the outer loop controller, the controller proposed in [7] is used, which produces decoupled second-order dynamics for the roll, pitch, and yaw channels, as given in (10). It is noted that M_d in (10) consists of the desired values of the moments M in (6). These desired values are realized by the control allocator, which is explained in the following section.

Control allocator
The goal of the control allocator is to receive the control signal, M_d, and realize this signal using n different RCJs in the presence of thruster effectiveness uncertainty, represented by Λ in (9). Inspired by [21], to achieve this goal, the problem is defined in the model reference adaptive control domain.
Assuming a stable 3-by-3 matrix, A_m, consider the dynamics of z, given in (11), and a reference model for z_m, given in (12). Representing the parameter matrix to be determined as Θ ∈ ℜ^{3×n}, the RCJ output vector, u, is created as in (13). Substituting (13) into (11) yields (14). To continue the technical development, it is assumed that an ideal parameter matrix Θ* exists such that (15) holds, where I is the identity matrix with appropriate dimensions. Defining e_z = z − z_m and Θ̃ = Θ − Θ*, where Θ̃ can be considered as the deviation from the ideal parameter matrix, and subtracting (12) from (14), the error dynamics (16) are obtained.
The adaptation law is obtained by conducting a Lyapunov stability analysis using the Lyapunov function candidate (17), where P is the solution of the Lyapunov equation A_m^T P + P A_m = −Q. Here, Q can be taken as any symmetric positive definite matrix, which also makes the solution P symmetric and positive definite. Taking the derivative of (17) along (16) and using the Lyapunov equation yields (18). Exploiting the property x^T y = tr(y x^T), where x ∈ ℜ^n and y ∈ ℜ^n, (18) can be rewritten as (19). Noting that Θ̃ and Θ have the same time derivative, if the adaptation law (20) is used, (19) can be rewritten as (21). Using Property 2, given in Section 2, it can be shown that V̇ ≤ 0. Then, with the help of Barbalat's lemma [22], it is obtained that lim_{t→∞} e_z(t) = 0, assuming that M_d is bounded. Boundedness of M_d can be obtained by using a soft saturation limit for the control signal.
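For readers who wish to follow the steps of this stability argument, one way the derivation can proceed is sketched below. This is a sketch under stated assumptions, not necessarily the exact form of (11) to (21): it assumes that the auxiliary dynamics are driven by the mismatch between the realized moment TΛu and the desired moment M_d, that the allocation law is linear in M_d, and that a scalar adaptation gain γ > 0 is used. The symbols γ and Y are introduced here for illustration only.

```latex
\begin{align*}
\dot{z} &= A_m z + T\Lambda u - M_d, \qquad
\dot{z}_m = A_m z_m, \qquad
u = \Theta^{T} M_d, \\
\dot{z} &= A_m z + \left(T\Lambda\Theta^{T} - I\right) M_d, \qquad
T\Lambda\Theta^{*T} = I, \\
\dot{e}_z &= A_m e_z + T\Lambda\tilde{\Theta}^{T} M_d, \\
V &= e_z^{T} P e_z + \frac{1}{\gamma}\operatorname{tr}\!\left(\tilde{\Theta}^{T}\tilde{\Theta}\Lambda\right), \\
\dot{V} &= -e_z^{T} Q e_z + 2 e_z^{T} P T \Lambda \tilde{\Theta}^{T} M_d
          + \frac{2}{\gamma}\operatorname{tr}\!\left(\tilde{\Theta}^{T}\dot{\tilde{\Theta}}\Lambda\right) \\
        &= -e_z^{T} Q e_z
          + 2\operatorname{tr}\!\left(\tilde{\Theta}^{T}\left(M_d e_z^{T} P T
          + \tfrac{1}{\gamma}\dot{\Theta}\right)\Lambda\right), \\
\dot{\Theta} &= \gamma\,\mathrm{Proj}\!\left(\Theta, Y\right), \qquad Y = -M_d e_z^{T} P T, \\
\dot{V} &= -e_z^{T} Q e_z
          + 2\operatorname{tr}\!\left(\tilde{\Theta}^{T}\left(\mathrm{Proj}(\Theta, Y) - Y\right)\Lambda\right)
          \le -e_z^{T} Q e_z .
\end{align*}
```

Under these assumptions, the last inequality follows because Λ is diagonal with positive entries, so the trace term is a positively weighted sum of column-wise terms, each of which is nonpositive by the sign property of the projection operator.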

Pulse width modulation
The control allocator determines the individual RCJ outputs (13) to realize the control signal (10), employing the adaptation rule (20). Since RCJs have only 2 discrete states, on and off, the continuous output vector, u , requested by the control allocator is achieved with the help of pulse-width-modulation (PWM) (see Figure 1).
To provide meaningful input signals to the PWM, the elements of u should be bounded in the interval [0, 1].
However, to facilitate the design of the adaptive law (20), a symmetric saturation limit, [-1,1] is defined for u . It is noted that this introduces additional uncertainty to the overall control system, which is expected to be handled by the adaptive control allocation.
Once the limits of u are set, the attainable moments can be obtained using the relationship (8). The limits of the attainable moment set are then set as the saturation limits for the controller output M d . Using these saturation limits and (13), the boundaries of the parameter matrix Θ elements can be calculated, which can be enforced using the projection algorithm.
The PWM used in the tests has a cycle time of 80 ms. The simulation step time is set to be 8 ms.
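With a cycle time of 80 ms and a simulation step of 8 ms, each PWM cycle spans 10 simulation steps, so a duty-cycle command can only be realized in increments of 0.1. The Python sketch below illustrates this quantization. It is a minimal illustration under assumptions: the function name `pwm_cycle`, the rounding rule, the placement of the on-steps at the start of the cycle, and the clamping of out-of-range commands to [0, 1] are choices made here and are not taken from the paper.

```python
def pwm_cycle(duty, period_ms=80, step_ms=8):
    """Return the on/off pattern (1 = jet firing) of one PWM cycle for a duty-cycle command."""
    steps = period_ms // step_ms          # 10 simulation steps per PWM cycle
    duty = min(max(duty, 0.0), 1.0)       # assumption: commands outside [0, 1] are clamped
    on_steps = round(duty * steps)        # assumption: nearest realizable pulse width
    return [1 if k < on_steps else 0 for k in range(steps)]


pattern = pwm_cycle(0.37)                 # jet fires for 4 of the 10 steps
realized_duty = sum(pattern) / len(pattern)
```

In this sketch, a command of 0.37 is realized as a duty cycle of 0.4, and any negative command produces no firing at all, which gives a feel for the mismatch between requested and realized jet outputs that the allocator has to tolerate.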

Simulation results
During the simulations, a continuous reference vector [ϕ_d, θ_d, ψ_d] is provided to the closed-loop system. The controller produces the desired moment vector M_d. The control allocator then produces the necessary actuator input signal vector, u, which is realized by PWM. The moment vector created by the RCJs on the vehicle is then calculated via (9), using the mapping matrix T given in [7]. It is noted that each entry of the matrix T represents the moment produced by the corresponding RCJ.
To introduce actuator uncertainty to the moment calculation, the actuator effectivenesses are reduced to 30 % of their full capacity, at t = 20 s, employing the actuator effectiveness matrix Λ .
The adaptive control allocation solution is compared with a conventional optimal control allocation method in the following sections. Before the comparison results are given, the parameter initialization process for both control allocators is explained below.

Initialization of the control allocation parameters
The adaptive control allocator's parameter matrix is updated online using the adaptive law given in (20).
The initial conditions for this parameter matrix can be selected as zero if one prefers not to use any prior knowledge about the plant. In the simulation studies conducted in this paper, the initial conditions for this matrix are calculated using (15), where the uncertainty matrix Λ is taken to be an identity matrix, since fault/uncertainty identification is not performed in this work. With this selection of Θ(0), the control allocation output is equivalent to the sum of 2 terms, where the first term is a fixed actuator signal and the second term acts as an adaptive augmentation whose adaptive parameter matrix elements are initialized to zero.
The initial parameter values obtained by setting Λ = I , where I is an identity matrix, and solving (15), are also the ideal values for the optimal control allocation, which is explained in the next section. Therefore the same initial values are used for the optimal control allocator.

Comparison with an optimal control allocation method
In optimal control allocation, the objective function is minimized while ensuring 0 ≤ u ≤ 1. The advantage of using the adaptive control allocator becomes more apparent when the actuator signals are compared. As seen in Figure 3, the actuator signals produced by the optimal control allocator exhibit high-frequency oscillations, while those produced by the adaptive control allocator show reasonable switching times.
Finally, the evolution of all vehicle states, p, ϕ, q, θ, r, and ψ, is presented in Figure 4. Even though a large uncertainty is introduced at t = 20 s, which is marked on the figure, all the states remain bounded and converge to constant values within a reasonable time interval.

Summary and discussion
In this paper, a control allocation solution is proposed for the control of launch vehicles in exoatmospheric conditions. There exist 2 main challenges for this task. The first challenge is controlling the system in the presence of actuator uncertainty. The second challenge is the specific characteristics of the reaction control jet (RCJ) actuators, which provide only on and off modes of operation, either providing a constant amount of thrust or not providing any input. The first challenge is addressed by the adaptation capability of the investigated control allocation algorithm. A common approach to remedy the second challenge is using pulse-width modulation (PWM), which enables the designer to treat the control signals as continuous duty-cycle commands. To achieve this, the control allocation output must be constrained properly. In the presented control allocation method, this is achieved using the projection algorithm. It is demonstrated that the adaptive control allocation method performs better than a conventional optimal control allocator in the presence of actuator uncertainty.
Although the simulation results show the advantage of using an adaptive approach compared to a nonadaptive method, there is considerable room for improvement in the presented adaptive control allocation method. For example, instead of using a symmetric projection boundary, an asymmetric boundary can be used to eliminate the additional uncertainty introduced due to negative control allocation outputs. Another improvement would be designing the control allocator for the RCJs' 2-state output (on/off), instead of using an approximate method such as PWM. This paper presents a first attempt at attacking the problem of controlling a vehicle with RCJs using adaptive control allocation, and the discussed improvements will be considered in future work.