1. Introduction
1.2 Research objectives
A. Topic 1: Water quantity issues for reservoir inflow forecasting and urban flood control
Accurate multi-step-ahead (MSA) forecasting is valuable in many engineering problems, but it is difficult to achieve. A common approach for improving the accuracy of MSA forecasting is to update network parameters through online learning techniques. Online learning is a supervised machine-learning framework that uses the latest information to adjust model parameters for a better mapping between instances and true values in an arbitrary system. Most observational disciplines infer the properties of an uncertain system from the analysis of time-dependent data, and the analytical techniques for extracting the meaningful characteristics of time series data have certain inherent limitations, which have been widely discussed (Brockwell and Davis, 1987; Jaeger and Haas, 2004). Owing to the continual receipt of true values for adjusting model parameters, online learning algorithms have several practical and theoretical advantages, such as memory-efficient implementation, runtime-efficient implementation and strong guarantees on performance, even for highly variable time series (Shalev-Shwartz et al., 2004). Nevertheless, the main drawback of online learning is its requirement for continual true values. Engineering problems often require models to predict many time steps into the future without measurements being available within the horizon of interest, and this lack of true values makes MSA forecasting difficult. In addition, many studies have indicated that recursively applying single-step-ahead predictions over many time steps is not an adequate strategy, because the errors of the single-step-ahead predictor accumulate as the forecast horizon grows (Parlos et al., 2000; Yong et al., 2010). Such time-lag problems may cause significant degradation in performance in real-world MSA forecasting applications. In MSA streamflow forecasting during typhoon events, models with time-lag problems (i.e., the latest observed values are unavailable) cannot track the flow trajectory, especially at peak flows, as the forecasting step increases. To mitigate the time-lag phenomena that occur in online learning algorithms, it is worth examining whether iterative adjustments of model parameters that take additional information into account, such as the latest true values and/or antecedent model outputs, would benefit MSA forecasting.
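The accumulation of error under recursive single-step-ahead forecasting can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the logistic map stands in for a chaotic system, and the "model" is simply the same map with a slightly wrong parameter. It is not the benchmark series or any of the networks used in this dissertation.

```python
def logistic_step(x, r):
    """One step of the logistic map x_{t+1} = r * x * (1 - x)."""
    return r * x * (1.0 - x)

R_TRUE = 3.9   # parameter of the "true" system (assumed for illustration)
R_MODEL = 3.8  # slightly misspecified single-step-ahead model (assumed)
HORIZON = 6

# Generate the true trajectory x_0 ... x_HORIZON.
truth = [0.3]
for _ in range(HORIZON):
    truth.append(logistic_step(truth[-1], R_TRUE))

# (a) Single-step-ahead forecasts: each forecast starts from the latest observation.
one_step_err = [
    abs(logistic_step(truth[k], R_MODEL) - truth[k + 1]) for k in range(HORIZON)
]

# (b) Recursive MSA forecasts: each forecast starts from the previous forecast,
# because no observations are available inside the forecast horizon.
recursive_err = []
y = truth[0]
for k in range(HORIZON):
    y = logistic_step(y, R_MODEL)
    recursive_err.append(abs(y - truth[k + 1]))

for k in range(HORIZON):
    print(f"step {k + 1}: one-step error = {one_step_err[k]:.4f}, "
          f"recursive error = {recursive_err[k]:.4f}")
```

In this toy setting the single-step error stays bounded because every forecast is re-anchored to an observation, whereas the recursive forecast drifts away from the true trajectory within a few steps.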
The real-time recurrent learning (RTRL) algorithm, proposed by Williams and Zipser (1989), is an effective and efficient online learning algorithm for training recurrent neural networks (RNNs), in which the synaptic weights of the network are adjusted in real time. Several studies have demonstrated that the RTRL algorithm for RNNs is very effective in modeling the dynamics of complex processes and in providing accurate predictions (Chang et al., 2002; Chang et al., 2012; Hirasawa et al., 2000; Li et al., 2002).
The first main goal of topic 1 is to develop a reinforced RTRL algorithm for RNNs (R-RTRL NN) that mitigates time-lag effects and thereby increases the accuracy of MSA forecasting. The sequential formulation of the R-RTRL NN is derived, and its reliability and applicability are demonstrated through two-step-ahead (2SA), four-step-ahead (4SA) and six-step-ahead (6SA) forecasting for a well-known benchmark chaotic time series and a reservoir inflow case in Taiwan. The comparative models are the original RTRL algorithm for RNNs (RTRL NN), the Elman neural network (Elman NN) (Elman, 1990; Liu and Wang, 2008; Liu et al., 2012) and the backpropagation neural network (BPNN, the most popular static artificial neural network, ANN).
Urban flood control is a crucial and challenging task, particularly in developed cities. Urban floods are flashy in nature, mainly because of severe thunderstorms, and occur both on urbanized surfaces and in small urban creeks, which deliver large volumes of water to cities.
Because rapid urbanization in metropolitan areas has increased the extent of impervious surfaces, reduced infiltration has increased both the flow rate and the amount of surface runoff over recent decades. Taiwan is located in the northwestern Pacific Ocean, where subtropical air currents frequently bring typhoons and convective rains. Urban flood hydrographs in Taiwan typically have large peak flows and limbs that rise within a matter of minutes, which can cause serious disasters.
For example, Typhoon Nari brought massive rainfall reaching 500 mm/day on September 17, 2001, which resulted in 27 deaths, inundation of some stations of the Taipei Metro System, and heavy economic losses. The heavy rainfall event on June 12, 2012 brought intense rainfall of 54.1 mm/hr, which directly caused rapid and widespread surface flooding such that the transportation system collapsed in most of southern Taipei City. It appears that floods cannot be prevented, but planning emergency measures through flood management might mitigate their disastrous consequences.
In response to the flood threats to residents and property, the Taipei City Government has long endeavored to develop flood control infrastructure, such as raising levees and enhancing sewerage systems, and urban inundation has therefore been significantly mitigated and controlled in recent years. As a result, the main threat to the city is now the floodwater inside the levee system. Surface inundation will inevitably occur if surface runoff exceeds the capacity of the storm drainage system. To tackle this problem, pumping stations play an important role in flood mitigation in metropolitan areas; they are the principal hydraulic facilities built to manage internal stormwater flows where gravity drainage cannot be achieved. The operation of a pumping station depends heavily on the water level of its floodwater storage pond (FSP). Within the catchment of a pumping station, surface runoff drains to the FSP for storage and subsequent disposal through gravity drainage. When the water level of the FSP reaches the start level of the duty pumps, the pumps are activated according to the operation rules to discharge the stored floodwater into the river near the pumping station. For floodwater control management during heavy rainfall or typhoon events, it is imperative to construct an efficient and accurate model that forecasts FSP water levels multiple steps ahead by using the current FSP water level and the rainfall measured at gauging stations near the pumping station. The proposed model is expected to provide sufficient response time for warming up the pumps in advance, thereby enhancing the security of pumping operations and urban flood control management.
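How a multi-step-ahead FSP forecast could translate into response time for pump warm-up can be sketched as follows. The function name, the 10-minute step, the forecast values and the start level are all hypothetical; they are not taken from any pumping station's operation rules.

```python
from typing import Optional, Sequence

def warmup_lead_time(
    forecast_levels_m: Sequence[float],
    start_level_m: float,
    step_minutes: int = 10,
) -> Optional[int]:
    """Minutes until the forecast FSP level first reaches the pump start level.

    forecast_levels_m[k] is the forecast water level (k + 1) steps ahead.
    Returns None if the start level is not reached within the forecast horizon.
    """
    for k, level in enumerate(forecast_levels_m):
        if level >= start_level_m:
            return (k + 1) * step_minutes
    return None

# Hypothetical six-step forecast (metres) and a hypothetical start level of 1.5 m.
forecast = [0.9, 1.1, 1.4, 1.6, 1.9, 2.1]
lead = warmup_lead_time(forecast, start_level_m=1.5)
print(lead)
```

The returned lead time is the window available to operators before the duty pumps would be triggered, which is why forecast accuracy at the longer steps matters in practice.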
The greatest success in flood forecasting is commonly achieved on large rivers.
Nevertheless, flash urban floods associated with heavy thunderstorms in cities are often very uncertain and are more difficult to predict because of the complex dynamic phenomena involved. Many studies have demonstrated the predictability of streamflow through soft computing methods (Maity and Kumar, 2008), whereas only a few papers have investigated the prediction performance for inundation and/or sewerage systems in urban areas (Chiang et al., 2010). The second main goal of topic 1 is to investigate the reliability and accuracy of short-term (10- to 60-minute) forecasting models for the FSP of a sewer-pumping system in Taipei City. Multi-step-ahead FSP water level forecasting models for flood pumping control during heavy rainfall and/or typhoon events are tailor-made using a static ANN (the BPNN) and two dynamic ANNs (the Elman NN and the nonlinear autoregressive with exogenous inputs (NARX) network). The results of these three ANN models are then compared to identify the effectiveness of recurrent connections. The forecasting system is designed to anticipate the occurrence of flooding so that the measures necessary to reduce flood-induced losses can be taken. The study will support urban flood disaster management and help the Taipei City Government achieve more proactive disaster preparedness.
B. Topic 2: Water quality issues for the spatio-temporal estimation with respect to the arsenic (As) concentration in groundwater and the total phosphate (TP) concentration in a river basin
The second topic of this dissertation focuses on water quality issues, for which stabilizing concentrations and understanding their variation are important to preserving healthy human and hydro-environmental systems.
Arsenic (As) contamination in groundwater has been reported in several countries, such as Bangladesh, Vietnam, Cambodia, China and Taiwan, and has resulted in massive epidemics of As poisoning. It is estimated that approximately 57 million people have drunk As-contaminated groundwater with concentrations exceeding the drinking water standard recommended by the World Health Organization (WHO) (BGS-DPHE, 2001; Chakraborti et al., 2010). Pollution by As affects crop productivity as well as the quality of water bodies, which threatens the health of animals and human beings through food chains. Long-term exposure to As through drinking water has been implicated in a variety of health concerns including cancers, cardiovascular diseases, diabetes and neurological effects (National Research Council, 1999).
Blackfoot disease and cancers of the skin, bladder, lung and liver have been associated with drinking As-contaminated groundwater (Chiou et al., 1997; Rahman, 1999).
As-contaminated groundwater is derived naturally from As-rich aquifer sediments, and the geochemistry of As can be rather complex (Stollenwerk, 2003). Various hydrogeological and biogeochemical factors affecting As concentration in groundwater have been identified, such as sediment mineralogy, microbial oxidation or reduction of As, groundwater recharge, groundwater flow paths (Ford et al., 2006; Wang et al., 2007, 2011; Xie et al., 2012), and the presence of fractures in bedrock formations (Ayotte et al., 2003; Liao et al., 2011). Even though the processes controlling the release of As into groundwater systems have been extensively discussed over the past decades, the exact chemical conditions and reactions leading to As mobilization remain a subject of intense debate (Goovaerts et al., 2005; Polizzotto et al., 2006; Winkel et al., 2008).
Moreover, As concentration can vary greatly within a short distance and/or across different depths of groundwater wells because of the diversity in geology and geomorphology (Serre et al., 2003; Yu et al., 2003). In addition, detecting As contamination in groundwater by graphite atomic absorption spectrophotometry or inductively coupled plasma mass spectroscopy can be laborious and costly.
Consequently, adequately estimating As concentrations in complex hydro-geological systems is a crucial and challenging task.
The hyper-endemic blackfoot disease in Yun-Lin County, Taiwan, has been verified to be associated with high As concentrations in groundwater (Chen et al., 1995; Chiou et al., 1997). Residents have long been exposed to As through various pathways, such as the ingestion of aquacultural and agricultural products, and thus face serious carcinogenic risks (Liu et al., 2008). Owing to great concern about the potential effects of As on human health, there is a growing need for efficiently modeling the spatial distribution of As contamination in groundwater. One popular modeling approach is multiple linear regression (MLR). This approach, however, may fail to estimate the spatial distribution of As contamination because of the great variability of As concentration and the complex nonlinear processes involved in geology and geomorphology. Recently, ANNs have been applied to the estimation of heavy metal concentrations in groundwater with a reasonably good degree of success (Chang et al., 2010; Cho et al., 2011; Giri et al., 2011; Mondal et al., 2012; Purkait et al., 2008). The modeling results indicated that ANN techniques can produce higher estimation accuracy than conventional methods such as MLR. These studies were mostly dedicated to exploring the applicability of static ANNs, such as the BPNN, for building the relationship between As concentration in groundwater and hydro-geological parameters in As-affected areas. Nevertheless, the natural characteristics of hydrogeological processes are not only complex but also dynamic.
Static neural networks might fail to establish reliable models for predicting dynamic features, so the derived relationship might merely reflect the possible impacts of factors on the temporal characteristics of local environments. Consequently, the comprehensive analysis of dynamic hydrogeological features and the estimation of As concentration variability over As-affected regions remain a great challenge.
The seasonal variation of streamflow in Taiwan is very high, and long-lasting low flows in drought seasons can dramatically increase pollution levels in rivers.
Pollution in the downstream reaches of rivers is a major environmental issue because many industrial facilities and large, populous cities are located along rivers. The water quality of the Dahan River in northern Taiwan has deteriorated rapidly due to heavy pollutant loads from surrounding urban areas. Considering the scattered watersheds over Taiwan and the high cost of field sampling, continuous water-quality time series with complete properties are unlikely to be obtained at all sampling locations. Alternatively, the Water Quality Index (WQI) has been designed to assess the general condition of water bodies such as rivers, lakes and reservoirs. The WQI is sensitive to slight pollution and is therefore a more suitable index for water quality management. The WQI numerically summarizes multiple water quality parameters into a single value; these parameters include dissolved oxygen (DO), coliform group, power of hydrogen (pH), biochemical oxygen demand (BOD), ammonia nitrogen (NH3-N), suspended solids (SS) and total phosphate (TP). Except for TP, which is measured quarterly, the water quality parameters adopted in the WQI are measured monthly in Taiwan. Therefore, a monthly WQI that incorporates TP would be more comprehensive and more beneficial to short-term (monthly) water quality management.
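The consequence of quarterly TP sampling for a monthly index can be sketched with a toy calculation. The aggregation used here (an unweighted mean of 0-100 sub-index scores), the sampling months and all scores are assumptions made for illustration; the official WQI formula is not reproduced.

```python
from typing import Dict, Optional

PARAMETERS = ["DO", "coliform", "pH", "BOD", "NH3-N", "SS", "TP"]
TP_SAMPLING_MONTHS = {3, 6, 9, 12}  # hypothetical quarterly sampling months

def illustrative_index(sub_scores: Dict[str, Optional[float]]) -> Optional[float]:
    """Unweighted mean of 0-100 sub-index scores (an assumed aggregation,
    not the official WQI formula). Returns None if any parameter is missing."""
    if any(sub_scores.get(p) is None for p in PARAMETERS):
        return None
    return sum(sub_scores[p] for p in PARAMETERS) / len(PARAMETERS)

# Twelve months of made-up sub-index scores; TP exists only in sampling months.
monthly_index: Dict[int, Optional[float]] = {}
for month in range(1, 13):
    scores: Dict[str, Optional[float]] = {p: 70.0 for p in PARAMETERS}
    if month not in TP_SAMPLING_MONTHS:
        scores["TP"] = None
    monthly_index[month] = illustrative_index(scores)

# Months in which the full index can be computed without reconstructing TP.
computable = [m for m, v in monthly_index.items() if v is not None]
print(computable)
```

Under these assumptions the complete index is available in only four months of the year, which is the gap that a monthly TP estimate is meant to close.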
TP, a combination of orthophosphate, polyphosphate and organic phosphate, is regarded as an index of the quantity of phosphorus in river water.
Phosphorus is an essential element for all life forms (Correll, 1998). When phosphorus enters a river, it is usually in the form of phosphate and can be transported from upstream to downstream by flowing water. Excessive phosphorus is the most common cause of eutrophication in freshwater lakes, reservoirs, streams, and headwaters of estuarine systems. Orthophosphate chemicals are commonly used in agricultural fertilizers and thus enter surface water easily during rainfall periods. Many studies have reported that the form of phosphorus fertilizer affects phosphorus loss to waterways (Azevedo et al., 2013; Davis and Koop, 2006). Polyphosphate is a primary ingredient added in considerable amounts to detergents. Organic phosphates are basically formed by biochemical processes associated with excrement, kitchen waste, water plants, etc. Although phosphorus is one of the key elements essential for the growth of plants and animals, the anthropogenic nutrient enrichment of natural water is of environmental importance because it can cause declines in water quality, changes in biotic population structures, and low dissolved oxygen concentrations in rivers (Dodds et al., 2009; Austin et al., 1996). Excessive phosphorus has been shown to be a main cause of eutrophication; for example, naturally occurring nutrients in large concentrations can often cause algal blooms (McDowell et al., 2010; Carpenter et al., 1998).
Water quality models are useful tools for estimating the levels and risks of chemical pollutants in a given water body (Duda, 1993). When building (or just applying) a water quality model, sufficiently long field records are needed to validate model applicability and reliability. Water quality monitoring programs, however, are expensive and time-consuming. Modeling practices commonly face limited budgets and time, and thus suffer from a shortage of field data. Under this condition, the implemented water quality models might fail to satisfy known hypotheses and/or assumptions, or might have difficulty producing estimates within an acceptable range of error or uncertainty. With the development of modeling theory and rapid advances in computing, many artificial intelligence techniques with various analytical algorithms have been developed to overcome data scarcity and simultaneously increase model reliability.
The NARX network (Lin et al., 1996), a sub-class of RNNs, is suitable for modeling long-term temporal input-output patterns (Menezes Jr. and Barreto, 2008). The NARX network has been demonstrated to perform well for several nonlinear systems, such as wastewater treatment plants (Su and McAvoy, 1991; Su et al., 1992), and for time series forecasting (Shen and Chang, 2013). However, the dynamic features and feasibility of the recurrent connections in the NARX network as a nonlinear tool for water quality time series modeling with limited data sets have not yet been fully explored. Therefore, topic 2 will explore the practical meaning and importance of recurrent connections in the NARX network when dealing with spatio-temporal water quality estimation problems.
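The role of the recurrent connections in a NARX model can be made concrete by looking at how its input vector is assembled: the output at time t is regressed on lagged outputs and lagged exogenous inputs. The sketch below builds such regressors from two series. The function name, lag orders and data are hypothetical, and the sketch shows only the input construction, not the network configuration used in this dissertation.

```python
def narx_regressors(y, u, output_lags, input_lags):
    """Build (rows, targets) for a NARX-style model.

    The row for time t is [y(t-1), ..., y(t-output_lags),
                           u(t-1), ..., u(t-input_lags)]
    and the matching target is y(t).
    """
    start = max(output_lags, input_lags)  # first t with all lags available
    rows, targets = [], []
    for t in range(start, len(y)):
        past_outputs = [y[t - i] for i in range(1, output_lags + 1)]
        past_inputs = [u[t - i] for i in range(1, input_lags + 1)]
        rows.append(past_outputs + past_inputs)
        targets.append(y[t])
    return rows, targets

# Made-up target series y and exogenous series u.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
u = [10.0, 20.0, 30.0, 40.0, 50.0]
rows, targets = narx_regressors(y, u, output_lags=2, input_lags=1)
for row, target in zip(rows, targets):
    print(row, "->", target)
```

When observed values of y are unavailable, the lagged-output entries would have to be filled with the model's own earlier estimates, and that feedback is where the recurrent connections come into play.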
In topic 2, a systematical dynamic-neural modeling (SDM) scheme that combines a dynamic neural network with advanced statistical methods is developed for building spatio-temporal estimation models for (1) As concentration at decommissioned wells, based on easily measured water quality parameters at nearby functioning wells, to offer decision makers an applicable and useful reference for groundwater management and for preventing residents from drinking or using toxic groundwater; and (2) TP concentrations at seven sites along the Dahan River at a quarterly scale, based on easily measured water quality parameters. In addition, TP concentration data are reconstructed at a monthly scale through a process that adopts the dynamic neural architecture of the constructed NARX network, so that the reconstructed monthly data can be used to produce the monthly WQI for short-term hydro-environmental management.