Pattern identification of biomedical images with time series: Contrasting THz pulse imaging with DCE-MRIs.

Objective: We provide a survey of recent advances in biomedical image analysis and classiﬁcation from emergent imaging modalities such as terahertz (THz) pulse imaging (TPI) and dynamic contrast-enhanced magnetic resonance images (DCE-MRIs) and identiﬁcation of their underlining commonalities. Methods: Both time and frequency domain signal pre-processing techniques are considered: noise removal, spectral analysis, principal component analysis (PCA) and wavelet transforms. Feature extrac- tion and classiﬁcation methods based on feature vectors using the above processing techniques are reviewed. A tensorial signal processing de-noising framework suitable for spatiotemporal association between features in MRI is also discussed. Validation: Examples where the proposed methodologies have been successful in classifying TPIs and DCE-MRIs are discussed. Results: Identifying commonalities in the structure of such heterogeneous datasets potentially leads to a uniﬁed multi-channel signal processing framework for biomedical image analysis. Conclusion: The proposed complex valued classiﬁcation methodology enables fusion of entire datasets from a sequence of spatial images taken at different time stamps; this is of interest from the view-point of inferring disease proliferation. The approach is also of interest for other emergent multi-channel biomedical imaging modalities and of relevance across the biomedical signal processing community.


Introduction
The aim of this study is to provide a brief account of recent advances in time series analysis and imaging, in order to identify the commonalities between terahertz (THz) pulse imaging (TPI) and dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) modalities as both approaches are complementary in their ability to identify and assess disease proliferation. A further aim of this work is to suggest some future directions for machine learning approaches that potentially lead to the automation of diagnostic processes. To clarify these concepts, the paper is structured as discusses generic signal de-noising methodologies, applicable to both systems and feature extraction methods using linear transforms. These discussions effectively focus on robust feature extraction from a single pixel perspective. We also discuss recent advances in different classifier methodologies, with an emphasis in complex ELM approaches. The methodology may be naturally extended to multi-pixel or voxel images. In addition to supervised learning, the current clustering techniques for segmenting THz images are briefly reviewed. In Section 4, we present current DCE-MRI image analysis algorithms. A brief discussion of algorithms relevant to both imaging modalities is provided in the context of future developments in tensorially based feature extraction. Extensions to multi-channel classifiers are considered, as these are necessary to account for specific geometrical features observed across an image and enable data fusion from heterogeneous sources. In addition, such multi-channel approaches enable the fusion of information acquired from multiple images at different time stamps, thus potentially elucidating disease proliferation. This section also discusses image registration issues. Section 5 provides some concluding remarks. The work aims to highlight progress towards a generic framework for the automated quantitative assessment of disease proliferation using both sensing modalities.

Concise introduction to fundamental physicochemical processes associated with interactions of waves with matter across the THz region of the EM spectrum
Investigations at the THz or T-ray part of the electromagnetic (EM) spectrum loosely defined between 100 GHz and 10 THz are of much relevance to biomedical science, because complementary information to traditional spectroscopic measurements on low-frequency bond vibrations, hydrogen bond stretching and torsions in liquids and gases may be obtained (Fig. 1). The vibrational spectral characteristics of bio-molecules, which lie in this range (wavenumbers between 3.3 and 333 cm −1 ) make T-ray imaging systems a promising sensing modality for clinical diagnosis. Since THz photons have significantly lower energies (e.g., only 1.24 meV at 300 GHz) than X-rays, they have been considered by many as non-invasive. Although non-linear interactions between biological tissue and coherent THz radiation have been predicted by Fröhlich [1] and experimentally verified by the careful work of Grundler and the analysis of Kaiser [2] in the 90s, the widely held view at the moment is that any measurement technique that operates at THz frequencies should be evaluated using current guidelines on specific absorption rates; these are only associated with the thermal effects of the radiation with the tissue so from a clinical perspective can be considered as non-invasive. Such a view is also further supported by noting that, the Gibbs free energy conveyed in the THz light beam is insufficient to directly drive chemical reactions. For example, the molar energy at a frequency f of 100 GHz would be given from E = Nhf where N = 6.023 × 10 23 mol −1 , (Avogadro's number), and h = 6.626 × 10 −34 Js (Planck's constant), resulting in a calculated value of only E = 0.04 kJ mol −1 which is so low (approximately 100 times lower than the amount of molar energy required for ATP hydrolysis) that for most practical purposes we may assume that the interference with biochemical processes would be minimal.
Further advantages of performing imaging based on the optical properties of biological tissue with THz radiation are the improved penetration depth within the tissue and the comparatively lower scattering than infrared light. Organ differentiation on the basis of tissue water content using microwave transmission or reflection measurements is often impractical because the diffraction limited minimum spot size related to a free-space beam is rather large, and as a consequence there is significant beam spill-over around most tissues and organs. This has limited the further proliferation of microwave imaging techniques to the biomedical field.
From a technological point of view, THz imaging is an emergent complementary imaging modality of much interest within the biomedical community. Its proliferation has been somehow delayed because it often needs to compete with positron emission tomography (PET) imaging that is capable of picomolar sensitivity but has poor spatial resolution and magnetic resonance imaging (MRI), which provides millimolar sensitivity with high spatial resolution. A diffraction limited imaging system operating at 2 THz would have a spatial resolution of 150 m, which may be considered limiting for many biomedical applications for which this imaging modality offers niche applications (e.g. differential imaging of cancer cells in breast tissue of pregnant or lactating women). From a clinical perspective, tumours need to be identified at the earliest possible developmental stage and unless suitable THz super-resolution techniques can be developed, it is unlikely current systems will be adapted by clinicians. If, however, imaging systems with sufficient signal-to-noise ratio (SNR) could be developed to operate at 10 THz, these would in principle have an attainable resolution of 30 m, which potentially have a resolving power just about sufficient for such applications. An additional complication of this imaging modality is that since 70% of the human body is composed of water, a large proportion of the excitation energy would be significantly attenuated. As a consequence, the resultant spectra in many biomedical experiments may only be unambiguously resolved after the application of elaborate post-processing techniques.

Introduction to DCE-MRIs
The principle of MRI was discovered in the late 1940s, through pioneering experiments by Sir Peter Mansfield while he was a student at Queen Mary College, University of London under the supervision of D.H. Martin and independently by Paul Lauterbur while he was at the University of Pittsburgh, the Mellon Institute of Industrial Research and subsequently the State University of New York at Stony Brook after following the work of Robert Gabillard on Nuclear Magnetic Resonance (NMR). It was only in 1980, however, that MRI was introduced as a new diagnostic modality and it was much later when it gained wider acceptance as an imaging tool for a diverse range of biomedical disciplines such as neurology, oncology, obstetrics and gynaecology. In a clinical setting, it is now routinely used in cardiovascular, musculoskeletal, gastrointestinal and liver imaging as well as in neuroimaging, providing information on their physiological status and pathologies. In this respect it also complements NMR.
MRI de-embeds structural details of the various organs on the basis of the observation of hydrogen atom (proton) de-excitation rates as they arise following an excitation by an oscillatory radio signal within a magnetic field. The energy emission stemming from the spinning hydrogen molecules is determined by two time constants, T 1 and T 2 and the contrast between different tissues is determined by the rate at which excited atoms return to the equilibrium state. The radio signals can be made to encode position information by varying the main magnetic field using gradient coils, enabling the formation of images. Simple T 1 and T 2 weighted images are very common in many medical applications. An important advantage that MRI has over its computed axial tomography (CAT) counterpart is that it avoids the deleterious effects associated with X-ray irradiation.
A significant advance in MRI technology for the non-invasive assessment of disease proliferation has been witnessed through the use of exogenous contrast agents, the so-called DCE-MRIs. Note that DCE-MRI enables the detection of tumour anomalies with high sensitivity [3]. The crucial advantage of DCE-MRIs over standard MRIs is that the DCE-MRI modality provides 3D spatial lesion information as well as temporal information regarding the progression of lesions. Furthermore it provides information by showing variations in contrast agent uptake rates and enables a more accurate assessment of the extent of lesions and new opportunities for their better characterisation [4,5]. From a classification perspective, DCE-MRI produces a sequence of three-dimensional (3D) patterns recorded at different time instances. These datasets are therefore four-dimensional (4D), with three spatial dimensions and a quantization in the time domain defined by the image interval lapse time. The detection of anomalies in spatiotemporal datasets is an emergent interdisciplinary topic that requires the development of completely new software tools [6]. Furthermore, as discussed in [7][8][9][10], analysing spatiotemporal patterns is critical for the correct identification of tumour anomalies in DCE-MRIs, establishing whether they are malignant or benign. Finally, it is worth noting that currently clinical MRI stands at the cross-roads between some important technological developments. These include the recent advances in higher field (3 Tesla) imaging systems as well as the more experimental low field (milli-Tesla) systems which have cryogenically cooled phased array detectors. Detail in the images produced by these new experimental modalities differs from that found in the traditional 1.5 T systems, requiring additional time by trained experts for their interpretation. Such problems can compromise their further proliferation unless automatic image classification algorithms can be developed.
Progress in machine learning offers new opportunities for automated biomedical data mining and disease diagnosis as it can potentially simplify the process of searching through large volumes of medical images for features selection and identification, as well as tumour screening. In addition, it can also potentially provide associations and correlations through time series analysis of datasets, thus elucidating disease proliferation [11][12][13][14][15][16]. Image processing techniques can be used to extract quantitative information on lesion morphology, volume and kinetics, as well as to distinguish viable from nonviable tissue [4,14]. Techniques for processing large volumes of medical image datasets with high dimensionality are not sufficiently mature, however. Experts often distinguish tissue states on the basis of tumour information from radiological reports by characterizing lesions either as malignant or as benign [4,[17][18][19][20]. Such practice is subjective, relying on the availability of experts and is occasionally inaccurate, as it involves a binarization step of the classified output without taking into consideration intermediate values as a fuzzy-set expert system would do (it also incorporates a slight bias towards false-positives to avoid mistaken diagnoses as this could potentially tarnish an expert's reputation). A further disadvantage of this approach is that it can be slow and therefore costly, inducing additional pressure loads in otherwise already overloaded health systems worldwide. In addition, it is also worth noting that alternative, tumour detection methodologies based on analysing a series of two-dimensional texture features as local descriptors using a sliding window, have a major shortcoming in that they are unable to take into consideration more complex morphological features of tumour anomalies across the entire tissue volume [14,21,10].
Furthermore, there are other pressing needs for the further development and wider proliferation of multi-channel machine learning algorithms for the biomedical community. In DCE-MRI, for example, motion correction software often needs to be employed [22]. Correlations with breathing, provided from an additional channel source (e.g. an optical fibre based plethysmograph) may potentially assist pixel de-blurring at a post-processing stage, minimizing artifacts [23]. Hyperspectral imaging is also another emergent modality in biomedical screening (e.g for skin cancer detection), requiring such multi-channel signal processing. From a SNR image quality perspective, for each pixel, an improvement is only possible by using a stronger excitation source per spectral bin, by improving on the detector responsivity and noise floor, and by integrating over a longer time at each pixel or voxel. Equipment manufacturers argue that there is really no substitute for increased integration time to perform the de-noising process. Broadband signals, however, enjoy the Fellgett multiplex advantage, which makes use of the broadband nature of a short pulse measurement modality to provide superior clarity per pixel for a set measurement integration time when compared to successively scanned frequency domain investigations using continuous-wave source measurement modalities. It is not uncommon, therefore, that in order to speed-up measurement time in many MRI applications, different software approaches are considered by limiting the fidelity (number of pixels or voxels) or reducing the number of tomographic projections. Such practice, however, can lead to image feature artifacts if sampling is not performed following Nyquist's criterion in the spatial domain [24,25]. A pre-processing step that can reduce the dimensionality of the data-set by compressing it to more parsimonious representations without distorting it can potentially minimise the dimensionality of the input dataset presented to a classifier. This can help in improving its classification accuracy and generalization ability. Furthermore, an additional advantage from compression is that it is advantageous from a signal-to-noise ratio perspective as the co-averaged signal components in each spectral bin are always in-phase whereas the noise components are out-ofphase and thus suppressed. Image reconstruction and registration are further discussed in more detail in Section 4 of this paper.

Biomedical imaging techniques and modalities for THz radiation and DCE-MRIs
The following section provides an in-depth discusion on the design, clinical applications and research potential of two emergent imaging modalities: THz pulsed imaging and DCE-MRIs. Emphasis is on technological aspects associated with the associated hardware but also extends to the science of image formation along with the properties and acquisition parameters, taking into consideration the underlining physics associated with these systems.

Time-dependent techniques using THz radiation
Although much of the pioneering work in building interferometric spectro-radiometers and other continuous wave measurement systems at the THz part of the spectrum took place at Queen Mary College over a period of almost 30 years under the guidance of D. H. Martin [26], it was only during the past two decades that THz science and technology has literally flourished as a universally accepted new sensing modality. This has been largely due to the advent of time domain spectroscopy (TDS) with ultrashort-pulse laser sources. These systems enable time-resolved 'far-infrared' (FIR) studies for biomarker identification, which through explorative spectroscopic investigations may eventually lead to novel imaging applications in the submillimeter part of the spectrum. T-rays have significant potential in advancing both in in vivo and in vitro biosensing applications [27][28][29] in addition to their relatively non-invasive interaction with biological tissue. Furthermore, a significant number of biomolecules have several characteristic 'fingerprint' resonances due to discrete molecular vibrational, torsional and librational modes, both in liquids and solids [30,31].
Notably, THz-TDS is a time-domain technique similar to the well-known pulsed radar sensing modality, where the time gated reflections are analysed directly in the time domain by observing their attenuation, phase delay and temporal spread after interacting with matter. Their temporally good definition can provide localization of tissue interfaces on the basis of their different refractive index. The real part of of the refractive index, as a function of frequency, is associated with phase spectra and the complex part is associated with the attenuation hence amplitude spectra. Studies in reflection geometry also enable the indirect assessment of sample or layer thickness, as well as in determining the position of embedded unknown objects, etc. [27]. Pulsed THz wave technology has also been applied extensively in biosensing. The pioneering investigations in biomolecule characterization were performed by the Aachen group [32] and Jepsen's group [33]. These were followed by a rapid growth of investigations by other researchers worldwide [34][35][36]. An interesting example is that of an affinity biosensor monitoring the binding between biotin and avidin molecules on supported membranes composed of biotin layers on a quartz surfaces treated with octadecanol, as proposed by Menikh et al [37].
In that work, an amplified detection of biotin-avidin binding was very clearly observed through the dithering of the samples in a THz beam. This was attributed to the conjugation of agarose particles and avidin molecules and a change in contrast due to a change of refractive index resulting from the chemical binding process [30,38]. As avidin has a very strong affinity for biotin and is capable of being bound to any biotin-containing molecules, the developed detection technique is quite generic and can potentially be used to detect DNA hybridization and antigen-antibody interactions [30]. Fischer et al. [39] was able to reliably distinguish between two artificial RNA single strands, composed of polyadenylic acid (poly-A) and polycytidylic acid (poly-C) from their different THz spectral transmission responses. In that study poly-C samples consistently showed stronger signal attenuation than poly-A samples.

Frequency dependent THz spectroscopy
In the THz part of the spectrum, many molecules have characteristic 'fingerprint' absorption spectra [27]. Substances in the condensed phase are held together by either ionic, covalent or electrostatic forces, and therefore the lowest frequency modes will be associated with intermolecular motion [40]. The interaction between THz radiation and biological molecules, cells, and tissues can be understood using assumptions of propagation of an angular spectrum of plane waves through the material [41]. Following standard postulates of dielectric theory, a medium may be characterized in terms of its permittivity ε (the ability of the medium to be polarized) and conductivity (the ability of ions to move through the medium). At higher frequencies, transitions between different molecular vibrational and rotational energy levels become increasingly dominant and are more readily understood using a quantum-mechanical framework [42]. THz pulse spectroscopy provides information on low-frequency intermolecular vibrational modes [43].

THz radiation absorption and detection in tissue
THz radiation interacts strongly with polar molecules, a prime example being water [44]. Polar water molecules are active in the infrared region and have various vibrational modes [45]. In the midto far-infrared, the vibrations involve combinations of the symmetric stretch ( 1), asymmetric stretch ( 3), and bending ( 2) of the covalent bonds. The vibrations of water molecules may be thought of as restricted rotations, resulting in a rocking motion, as shown in Fig. 2(a). In liquid water, since hydrogen bonds are much weaker than the covalent bonds (intra-molecular), their bond lengths are much longer (1.97Å versus 0.96Å), as shown in Fig. 2(b). Steric effects from dipole moments in water clusters vary according to hydrogen proximity and as a consequence, shifts in ro-vibrational modes at THz frequencies are encountered. Furthermore, these shifts are expected to be loosely correlated with different water potential values, which indirectly affect the ability to interact with the surrounding molecules. This has further important ramification on the way proteins influence the state of water and can lead to further understanding of the function of hydration shells in proteins [46,47] THz-TDS provides a direct measure of the real and imaginary components of the permittivity. A Debye relaxation model can be used to analyze the strong absorption of THz radiation in polar liquids at least up to 1 THz [48,31]. This model can be directly related to the associated intermolecular dynamics.
Biological tissue is generally composed of polar liquids. Due to the exceptionally high absorption losses of polar liquids at THz frequencies and the low source power in TPI systems, it is impossible for the THz radiation to penetrate through biological tissue of any substantial thickness. However, the same high absorption coefficient that limits penetration in tissue also provides extreme contrast between samples at various degrees of water saturation [31]. This property has proven advantageous in the examination of the properties of water uptake and distribution in plants [49,50], as well as in the evaluation of the severity of burns through the evaluations of necrotic skin samples [28]. In addition, [51] and [52] describe the application of TPI techniques for imaging of basal cell carcinomas (BCC) ex vivo and in vivo. Note that BCCs typically show an increase in absorption of THz radiation compared to normal tissue. This may be attributed to either an increase in interstitial water within the diseased tissue [53] or a change in the vibrational modes of water molecules through interactions with other functional groups. Systematic studies in tissue identification are reviewed in [31].

Identification of compounds with complex composition
The identification of pure compounds using molecular signatures with THz-TDS systems is still not straightforward because of the inherently broad spectral signatures in liquids and solids. Nevertheless, there is a growing number of multiply confirmed observations of particular resonant signatures that may be attributed to the presence of many compounds in pure form [54]. Of particular relevance here is the growing interest in studying the conformational structure, binding states, and vibrational or torsional modes of proteins and oligonucleotides [55,56] through the analysis of spectral features [57]. Different reflection or absorption signatures may also be attributed to a change of density or polarizability, or may indicate a dehydration state, or a denaturing process leading to a new amorphous absorption band or a temperature related absorption band shift. A compilation of readily identifiable spectral signatures of complex biomolecules in an atlas has already been considered at Durham University, and significant progress is being made, yet there is recognition by the THz community that it is unlikely it will have the universal applicability associated with other databases such as HITRAN, because of the variability in the location of the spectral bands observed under different experimental protocols.
There are still, however, several examples of useful studies that may be found in the recent literature. Nishizawa et al. [58] illustrated the use of a widely tunable coherent THz scanning system for THz transmission spectroscopy to study samples consisting of nucleobases, nucleosides, deoxynucleosides, and nucleotides, to further gain an insight of the composition of RNA and DNA molecules. The THz spectra of those samples in crystalline form were measured in the 0.4-5.8 THz range. These studies showed that the molecules have quite different characteristic spectral patterns in this frequency region, furthermore the absorption signature patterns observed were sufficiently clear and reproducible for identifying and discriminating between these molecules. Using pulsed THz spectroscopy [55] it has also been possible to study the low frequency collective vibrational modes of bio-molecules, i.e. DNA, Bovine Serum Albumin and Collagen in the range 0.1 and 2.0 THz. It is generally accepted that for most samples, broadband absorption increases with frequency and a large number of the low frequency collective modes for these systems is also deemed as IR active. Herrmann et al. [83] also carried out measurements of THz spectra of Poly(dA-dT)-Poly(dT-dA) DNA and Poly(dG)-Poly(dC) DNA and used new signal processing routines to infer the THz complex refractive index. The resultant spectral features showed that those samples were indeed distinguishable in the range 0.1 to 2.4 THz. Several research groups in Germany and Australia, have also studied the photo-isomerization of retinal chromophores [59,60] focusing on the conjugated polyene chain of the biologically important chromophore retinal and its lowfrequency torsional vibration modes. In that work, the absorption and dispersion spectra of different retinal isomers (all-trans; 13cis; and 9-cis retinal) in the far-infrared region between 10 and 100 cm −1 (0.3 ± 3.0 THz) were measured by THz-TDS at 298 and 10 K. At low temperatures, it was observed that the broad absorption bands resolve into narrow peaks that directly correlated to torsional modes of the molecule. The study also confirmed that vibrational modes within the molecule can be approximately localized through a comparison of the absorption spectra of different retinal isomers. An alternative important research direction vigorously pursued by Teraview Ltd., Cambridge, U.K., aims to put an end on patent infringements within the pharmaceutical industry by detecting the presence of drug polymorphs [43,61,62]. Such studies, for example, have successfully used TPI to examine the variation in the crystalline structure of Ranitidine Hydrochloride polymorphs. Significant differences in the spectra of two different polymorphs were clearly observed at around 1.10 THz enabling their correct identification. A recent account on advances in the identification of crystalline structure of drugs using TPI is provided in [63]. Furthermore, the observation of the crystallization of compounds has also been possible [64].

Time-frequency dependent THz spectroscopy
Time-frequency analysis methods have been developed to provide very parsimonious parametrizations of time series datasets and in this sense complement well other parametrization schemes performed in either time or frequency domains [65,66]. The wavelet transform (WT) is a popular technique suited to the analysis of short-duration signals [67]. It decomposes the time series signal using two filter banks separating the high (detail) and low (approximation) frequency components of the signal assuming a pre-defined mother wavelet function. The approach provides very efficient de-noising capabilities in the presence of Gaussian white noise and has very parsimonious representation. An important feature of this transform is that it has orthogonal basis functions so that it enjoys perfect reconstruction symmetry, which enables its inverse transform to reproduce the original dataset without loss of information. This is a particularly important property from a biomedical signal processing perspective as software certification for biomedical purposes should require complete traceability of all the data processing steps. A further development in the biomedical signal processing literature has been the use of adaptive wavelets [68], where the mother wavelet is specifically tailored at each decomposition level (wavelet scale) accordingly, to minimize the least squares error associated with the difference between the transformed signal from its original one. This approach holds great promise for optimizing the extraction of the spectroscopic information contained in each THz pulse transient as well as in THz TPI generally [69][70][71][72]. Fig. 3 presents the result of this simulated test, in terms of classification errors. To generate this graph, the standard deviation of the noise was varied from 0.001 to 0.5. For each noise level, 250 noisy patterns were generated for each class (lycra and leather). As can be seen, the classification is much more robust to noise when carried out in the wavelet domain than in the original domain. Moreover, the robustness to noise is further increased by the optimization of the WT.
In addition to the above more elaborate routines, there have also been other examples of studies that incorporate WT pre-processing routines for signal-to-noise-ratio enhancement and classification of THz spectra [28,73]. Such pre-processing steps enabled the successful discrimination of cancerous from normal tissue in waxembedded histopathological melanoma sections as well as the classification of dentine and enamel regions in teeth [69]. It is nowadays generally accepted that the performance of a classifier based on the output of a wavelet filter bank is improved over that of an Euclidean distance classifier in the original spectral domain [72]. Finally, an alternative very promising approach for the modelling of de-excitation dynamics, which has its origins to the theory of complex dielectrics, is through the use of fractional order calculus and the fitting of fractional order models. In this approach, the time series experimental datasets are modelled using very parsimonious pole-zero expressions associated to dynamics of resistive, capacitive or inductive networks [74][75][76]. Although the fractional-order system identification literature is still at its infancy, it promises to provide much lower residual errors in the identified models thus significantly advancing the science of chemometrics. These algorithms, however, have yet to be adopted by the biomedical signal processing community. The approach can account for spectral shifts in amorphous materials as well as de-embed solvation dynamics.

A brief introduction to experimental set-ups for THz trans-illumination
THz imaging can be remarkably informative regarding a sample's composition. The Fourier transform (FT) of the associated time domain waveform over a certain spectral range allows the calculation of the frequency dependent refractive index and absorption coefficients of the sample. If the FT is performed in real time using dedicated hardware, it becomes possible to extract the above parameters also in real time. Since wavelengths are longer in the THz part of the spectrum, there is sufficient phase stability in the experimental apparatus, this enables the extraction of phase information by varying the time delay between the THz wave and the probe beam [27]. Since some materials are transparent to THz radiation, it is occasionally feasible to measure transmission responses and acquire spectral information. Reflectance imaging is also straightforward, and through their combination, a spectral absorbance may be inferred. This is not always possible at the infrared and optical parts of the spectrum where errors due to scattering of shorter waves due to the surface roughness of the samples preclude direct calculations of absorbance. Elementary signal analysis may also be used (e.g., differential absorption) to produce informative contrast images that can be invaluable to the evaluation of disease proliferation.
Using continuous wave systems, there is a variety of instruments that may be assembled using quasi-optical active and passive components. The AB Millimetre vector network analyser, if available, is the preferred choice for continuous wave measurements with significant signal-to-noise ratio per spectral bin all the way up to 1.2 THz although it is not as user friendly for extracting scattering parameters as other commercially available solutions that operate at much lower frequencies. An account of different topologies using null-balance methods can be found in [77] whereas polarimetric measurements for dichroic samples should be ideally performed using the topologies discussed in [78,79] or Fabry-Perot structures, e.g. [80]. Alternative broadband experimental configurations may include Mach-Zehnder or Martin-Puplett configurations as discussed in [50]. When high power per spectral bin is needed, THz imaging may also be performed with high-power THz sources under pulsed scanning mode and pulse-gated detection using large scale facilities (e.g. Jefferson lab, FELIX etc) but at significant cost. Currently, however, biomedical investigations with such facilities lag behind investigations performed by the semiconductor community.
An important advantage of time-domain systems over their continuous wave counterparts that are plagued by etalon effects is that of being able to perform time gating of the pulses. This is possible as long as the multiple reflections in the measurement system are sufficiently far away so as not to be mixed with the molecular de-excitation signals of the sample. The typical time-resolved THz spectrometer used in most of the studies discussed so far utilize a short coherence length infrared source (centered at around 800 nm) to generate a sub-100 femtosecond duration pulse train with repetition frequency of around 80 MHz. Each infrared pulse, is split into separate pump and probe beams. The pump beam is used to excite an optical rectification crystal, which acts as a T-ray emitter, and the T-rays produced (duration around 200 fs) are collimated and focused onto a sample by a pair of parabolic mirrors. The T-rays emerging from the sample are re-collimated by another pair of mirrors, before being combined with the probe beam in a T-ray detector crystal. As a result, the modification by the sample T-ray and the probe beams propagates through the THz detector crystal co-linearly. The pump beam, which is also transmitted through a chopper, travels through an optical delay stage that is modulated accordingly, so that the pump and probe beams arrive at the detector in a time-coincident manner. The electro-optic detector crystal produces an output that is proportional to the birefringence observed from the interaction of the THz pulse with the timecoincident infrared pulse replica within the crystal. This output is proportional to the T-ray response of the sample and this signal is measured with the use of a balanced optical photo-detection scheme. A lock-in amplifier (LIA) is also used to demodulate the signal, and this avoids 1/f (flicker) noise problems that are present in this detector-limited measurement scheme. Typically, THz-TPI is performed through a 2D raster scan after translating the sample in both the x and y direction, while keeping it at the focal plane of the parabolic mirrors. A typical setup [81] is shown in Fig. 4.
It is also worth noting that in all of the above experimental set-ups one needs to always consider that there may also be additional pseudocoherence errors because different parts of the beam across its aperture travel different paths through different regions of the sample (if this is of non-uniform thickness), interfering constructively or destructively with each other when they recombine. A recent account of advances in THz metrology discussing errors in both continuous wave as well as THz-transient systems can be found in [82]. Such errors are endemic to much of the THz literature although this is not extensively discussed. Management of these artefacts and their relevance to imaging applications is therefore an open issue requiring further consideration.
The resultant measurement at each pixel position of an image corresponds to an entire waveform in the time domain. Therefore, the result from TDS-TPI is a three-dimensional (3D) data set, which then can be potentially mapped to two-dimensional (2D) images [83], where structural and compositional discrimination based on a sample's optical properties may be conveniently performed using pattern recognition algorithms. In the following section, sample responses from multiple THz spectrometry experiments are discussed within the context of a pattern recognition framework. This paves the way for the extension of pattern recognition algorithms to the above emergent sensing modalities [84].

DCE-MRI imaging techniques and modalities
As stated earlier, MRI provides new opportunities for the classification, grading, and diagnosis of tumours, enabling both surgical planning and clinical management on the basis of tumour morphology. It further provides clear differentiation of healthy organs and anatomical structures so that an association of lesions with neighbouring organ structures can be inferred. It is, therefore, widely used in cancer diagnosis elucidating tumour response to treatment [85]. Dynamic contrast enhancement patterns can be affected by a wide range of physiological factors which include vessel density, blood flow, endothelial permeability and the size of the extravascular extracellular space in which contrast is distributed [86,85]. The crucial difference from traditional medical imaging (i.e. X-rays) is that the DCE-MRI modality provides 3D spatial information about lesions as well as temporal information about lesion physiology (showing variations in contrast agent uptake rates), allowing for more accurate assessment of lesion extent and improved lesion characterisation [8].
Typically, DCE-MRI images are acquired with the use of a conventional gradient echo (GRE) pulse sequence to repeatedly image a volume of interest after injecting a contrast agent, such as gadolinium diethylenetriamine pentaacetic acid (Gd-DTPA) into the patient's blood stream. The DCE imaging employs a full k-space sampling strategy, (where the k-space relates to the associated wavenumber, a terminology originally established by the semiconductor industry). Three dimensional (3D) volume acquisition slice profiles are generally rectangular and slices are always contiguous. This means that signal derives uniformly from all tissue in each slice, without cross-talk.
One of the biggest advantages of the gradient-echo pulse sequence is that it can be performed quickly enough to enable 3D FT (3DFT) data acquisitions. In 3DFT imaging, an entire volume or slab of tissue is excited, rather than merely a thin slice of tissue. As a consequence, the 3D MRI data acquisition consists of three different phase encoding directions: the transaxial plane, the sagittal plane, and the coronal plane. Typically, 3D image sets are obtained sequentially every few seconds for up to 5-10 min. There is always a trade-off between spatial (sRes) and temporal (tRes) resolution. Usually, most radiologist's clinical protocols show a preference for scans at high sRes allocating only 1-2 min for tRes data acquisition [87].
A typical DCE-MRI dataset used for the image analysis consists of one baseline 3D MR image which is used as a reference before contrast agent injection. This is subsequently measured in a repeated manner to acquire post-contrast images at the second, third, till sixth time slices. Each time slice has a typical time interval equal to 60 s.
Following the terminology introduced independently by Ljunggren [88] and Twieg [89] we shall refer to the, k-space to denote the spatial (either 2-D or 3-D) frequency domain of the imaging system. Within the context of temporary image processing the k-matrix, is composed of digitized MR signals stored during the data acquisition process reconstructive computation steps. The complex data entries are associated to a pulse sequence of accurately timed radio frequency and gradient pulses.
In clinical practice e.g during a DCE-MRI mammogram, it is desirable to reduce the signal acquisition time; this is normally achieved by undersampling the k-space. This may achieved by adopting a random partial k-space updating [90] protocol. The HASTE sequence which samples half the k-space [91] is now routinely used in clinical MRI.
Aliasing from sampling the k-space below the Nyquist rate, however, introduces imaging artifacts. Since there is always a need for better tRes while preserving adequate SNR and sRes, several groups [92][93][94] have shown that it is possible to accelerate DCE-MRI (without employing parallel imaging) by a factor of ten using compressed sensing (CS) based image reconstruction as proposed in [24]. The approach allows filling of missing k-space data using a constrained optimization technique to interpolate the values between under-sampled adjacent data points in the spatial domain.
Recently work by Yin et al. [25] is based on the broad principles of compressed sensing and makes use of the fact that, when undersampling the k-space, it is possible to use variable density sampling schemes in a Cartesian coordinate system to widely distribute the resulting artifacts and reduce their visual impact. Such an approach was further explored using a model-based method for the restoration of MRIs with sparsity representation in a transformed domain, e.g. spatial finite-differences (FD), or discrete cosine transform (DCT). The reduced-order model, in which a full-system-response is projected onto a subspace of lower dimensionality, has been used to accelerate image reconstruction by reducing the size of the linear system associated with the measurement space. The singular value threshold technique [95] (SVT) was used in the denoising scheme to reduce and select the model order of the inverse FT image, and to restore multi-slice breast MRIs that have been compressively sampled in k-space. Restored MRIs with SVT de-noising show reduced sampling errors compared to direct MRI restoration methods via spatial FD, or DCT. The difference image related to Identify Transform (IT) shown in Fig. 5(b) contains a relatively large number of noisy (error) pixels that are located around the boundary of the imaged section. Reconstruction with IT shows also some blur at the image edges. In contrast, the reconstructed image using SVT denoising illustrated in Fig. 5(a) shows a reduced number of error pixels compared to the reconstructed image in Fig. 5(b).

Advantages and shortfalls of T-rays and DCE-MRIs
Advanced TPI and DCE-MRIs are becoming widely available in conventional clinical practice, where data acquisition and analysis are comparable despite inherent differences in signal acquisition and physicochemical basis of tissue contrast. One of the primary advantages of THz imaging over MRI is the potential specificity of a multitude of spectroscopic features especially if new contrast agents can be successfully developed. Unfortunately, the responses of many biological tissues are unknown in this band. In addition, a number of papers discuss spectroscopic investigations of biomolecules such as DNA [55,58,83]. An associated problem is the development of computer aided diagnostic algorithms for interpreting the multispectral images obtained by T-ray imaging [96]. A number of authors have considered this question by fitting the measured data to linear filter models and using the filter coefficients as a means to classify tissue types [97]. In this context, for example, one of the most important potential applications for THz technology is the detection and identification of biological and chemical agents [98]. Although T-rays can be used to image tumour microvasculature, more reports on the feasibility study using TPI are focused on imaging breast tumours [99,100] and skin cancer [101], due to the absorption of THz waves by water that is found in biological tissue. A recent review has been carried out by Yu et al. [102], where investigations relating to the potential of THz imaging and spectroscopy for cancer diagnosis has been highlighted. An important progress in THz tumour image was reported by Huang et al. [103] in 2006. In their work, they provided an in vitro demonstration of gold nanorods as novel contrast agents for both molecular imaging and photothermal cancer therapy. As a result of the strongly scattered red light from gold nanorods in dark field, observed using a laboratory microscope, the malignant cells are clearly visualized and diagnosed from the nonmalignant cells.
A further recent development in MRI is that of dynamic contrastenhanced magnetic resonance imaging (DCE-MRI). This refers to the acquisition of serial MRI images before, during, and after the administration of an MR contrast agent. Unlike conventional enhanced MRI, which simply provides a snapshot of enhancement at one point in time, DCE-MRI permits a fuller depiction of the wash-in and wash-out contrast kinetics within tumors. Depending on the analysis approach chosen, measurements may be derived from signal intensity data, or more commonly, signal intensity data are transformed into contrast concentration data prior to analysis [104]. In a THz imaging system, contrast is based on differences of attenuation, phase delay and dispersion of the THz pulse between normal adipose (fatty) tissue when compared to that of tumour tissues whereas in DCE-MRIs, contrast is based on the basis of observing the different degree of absorption of the contrast agent. Furthermore, because of the associated dynamics of the contrast agent, and the need to re-introduce a contrast agent if the imaging process is to be repeated, DCE-MRI systems provide a more limited temporal resolution. This can be a limitation in accurate tumour reconstruction after projecting 4D images to a 3D space as illustrated in the Section 4.

Analysis & classification of THz imaging data
T-ray pattern classification aims to select a sub-set of significant features in the image data set, without incurring a dramatic loss of information. Through pattern analysis, any relations, regularities, or structures associated with the measured T-ray response can be found. By detecting patterns, a T-ray classification system should be able to automatically make generalizations on its input space on the basis of its training set [105]. Such an approach is particularly useful within a laboratory automation context. Consequently, complete automated solutions should be seen as composed of three different modules that may be individually optimized for particular samples and data sets: the data acquisition imaging spectrometer module, the data de-noising pre-processing module and the classifier module. Depending on the type of sample, tuning may be tailored for each module before the learning process is initiated. Fig. 6. Image of an insect on an oak leaf obtained by TPI. The image is produced by plotting the peak value of the response for each pixel. (a) Raw image. (b) 3 dB SNR image corrupted by noise such that the SNR was. (c) Reconstructed image after de-noising using a Coiflet wavelet filter decomposition of order 4 where the average SNR is 10 dB greater than in (b). After [111].

Pre-processing for THz-TDS pulses
During pre-processing, input data vectors need to be grouped together into sets of feature vectors [84]. The choice of parameters for grouping is fundamental to the subsequent performance of the classifier. Data pre-processing aims to isolate the real Tray responses from the effects of amplitude and phase noise associated with the pulse to pulse stability of the source, laser beam pointing stability and detector shot noise, thus reducing artifacts that could compromise the classifier performance. Coaveraging multiple measurements by moving the scanning delay line uni-directionally several times through its central maximum (slow-scan mode) or changing the lock-in time constant (fastscan method) improves signal-to-noise-ratio per pixel but at the expense of significantly increased measurement time and image acquisition. Since improvement in signal-to-noise ratio is proportional to the square root of the number of co-averaged time-domain responses; this approach significantly limits the integration time that may be budgeted for each pixel. Furthermore the approach is unsuitable to the analysis of samples whose THz response is time-dependent e.g. drying. Signal processing can partly alleviate some of these issues. Window apodization for example reduces frequency domain Gibbs ripple due to the data discontinuities at the edge of the recorded time domain interferograms. Optimization of the apodization function is now possible using algorithms accounting for the asymmetry of the propagating femtosecond THz pulses [106].
Multiresolution techniques such as WTs are particularly effective to further de-noise mean-centered apodized interferograms. A typical de-noising procedure consists of decomposing the original signal using DWPT or DWT [107][108][109], thresholding the detail coefficients, and reconstructing the signal by applying the appropriate inverse transform (IDWT or IDWPT respectively). For the de-noising of femtosecond THz transients, a three-level decomposition is usually sufficient [110] and unnecessary computational load associated with more decomposition levels can be avoided. Fig. 6, illustrates the process [111].
An alternative promising pixel de-noising method involves the use of wavelet power spectrum estimation techniques (WPSET), as discussed by Kim et al. [112]. This approach may remove spectral artefacts without distorting spectral features in a more efficient manner. The author's application of WPSET to the transmission spectrum of water vapor verified the effectiveness of the approach. This is a nonparametric approach based on a wavelet representation of the logarithm of the power spectrum [113]. Alternative signal pre-processing methods for de-noising include base line correction [114], smoothing [115], first and second derivative [115,116], multiplicative scatter or (signal) correction [117], and standard normal variate analysis [118] Although all these methods have their own merit under different experimental conditions, one may argue that wavelet de-noising has the widest applicability [119].
From an imaging perspective, the discrete cosine transform (DCT) using Wang factorization, Lee's power of two block lengths DCT scheme, Arai's scheme, Loffler's algorithm or Feig-Winograd factorization [120][121][122][123] are particularly useful. The attractiveness of the DCT algorithm stems from the fact that it is asymptotically equivalent to the Karhunen-Loève transform which possesses optimal de-correlation as well as optimal energy compaction properties. Two-dimensional WTs are well established in multimedia coding standards (H.265 and JPEG2000) [124]. Almost real-time hardware implementation is possible nowadays using dedicated Field Programmable Gate Array (FPGA) technologies, with 45 nanometer complementary metal-oxide-semiconductor (CMOS) standard cells capable of operating at 7.6 Gpixels/second for an 8 × 8 block rate at more than 100 MHz [125].

Feature extraction methods
Each T-ray measurement with n data points can be viewed as a vector in a n dimensional space, known as a pattern space. The time sequence then appears as a point in the pattern space. The aim of feature extraction is to reduce dimensionality by converting preprocessed data to feature vectors. There are three considerations in the feature extraction and selection process: (i) the establishment of sound feature evaluation criteria, (ii) deciding upon the dimensionality of the feature space, and (iii) the choice of the algorithmic optimisation procedure adopted.
This analysis is carried out pixel by pixel, on the THz image. Feature vectors are grouped together via a decision function and then are evaluated to see whether they provide meaningful information. The features of interest upon which a classifier may be constructed are pulse height, shape, delays in the time domain, as well as the spectral content of the pulse in the frequency domain (after retaining both real and complex parts). Alternative parameters include the complex reflection coefficient commonly associated with an impedance mismatch and the complex insertion loss, commonly associated with the number of absorbers and extinction coefficient of the components in the sample. Differential transmittance and absorbance, with reference to a known sample can also provide contrast information that may be used for classification purposes.
Essentially, features should consist of parameters that display non-transformed structural characteristics: moments, power, amplitude information, energy, etc. as well as transformed structural characteristics: frequency and amplitude spectra, coefficients from wavelet decomposition, AR, ARMA coefficients, coefficients derived using subspace identification methods from state space analysis of the corresponding time series, etc.
Pre-processing techniques focusing on dimensionality reduction in the feature space are at the core of a successful pattern recognition system. Inclusion of more features improves classifier performance but compromise the generalization ability of the classifier. This is the well-known curse of dimensionality [126], which becomes quite prominent if the number of features is above 30.
The fast WT may be adopted to achieve effective feature extraction. The features presented to the classifier, in this case become the extracted wavelet coefficients. The use of Auto Regressive (AR) and Auto Regressive Moving Average (ARMA) models on the WTs of measured T-ray pulse data has been previously discussed elsewhere [97]. The features of a processed THz signal are eventually classified by an Mahalanobis distance classifier. The effectiveness of this method is demonstrated via cancer cell discrimination from normal tissue and on the problem of recognising different kinds of powders. Table 1 Advantages and drawbacks of linear transforms.

Transform
Advantages Disadvantages PCA Maximum information compression Each PC is related to the whole spectrum. no a priori assumption on a particular model associated with the dataset structure so no artefacts. "Problematic" regions in the spectrum cannot be excluded after the computation of the PCs.
Since the analysis functions are obtained from the statistics of the data, many calibration samples may be required to obtain reliable PCs. Fourier The analysis functions are fixed. Spatial information is lost. Problematic regions in the spectrum cannot be excluded after the transform. Compacting a waveform using Fourier coefficients is not as efficient when samples have complex spectra and filtering too many Fourier coefficients can lead to signal distortion. Wavelet Spatial information is kept. The width The choice of the mother wavelet is usually of the analysis window is difficult. automatically varied. For best performance, adaptive wavelets should be considered but the optimization methodology can have significant impact on classifier performance

Linear transforms for feature extraction
Linear transforms are useful both for noise extraction and for representing the information in the data using fewer coefficients. Noise extraction can be performed by assuming that the system is detector noise limited rather than source noise limited. In this case, the noise spreads equally among all transform coefficients, while useful information will generally be concentrated in a few coefficients. Examples of commonly used linear transformations for the processing of spectroscopic data include the FT, windowed FTs, WTs, and principal-component analysis (PCA). A comprehensive evaluation of various linear transforms that may be used for the denoising of spectra from continuous wave THz spectrometers can be found in [65], the main conclusions of the work are summarized in Table 1.
An example of using spectral features to perform classification of TPI signals is discussed in Yin et al. [127]. The amplitude and phase at certain key frequency components constitute pairs of feature subsets on which the classification is based. An important advantage of this approach is the small dimensionality of feature vectors. This allows the features to be directly extracted from pulsed responses with relatively low computational complexity. Fig. 7 shows the phase and amplitude plots in the frequency domain for six different powder samples: sand, talcum, salt, powdered sugar, wheat flour, and baking soda. Each curve is associated with a single pixel sampled from the image data. The spectrum has a cut-off frequency at 3 THz. Sharp changes of amplitude at the second frequency bin may be observed in Fig. 7(a). It can also be seen that samples have significantly different frequency dependent phase patterns, so that a classifier using this information can be implemented illustrated in Fig. 7(b).
Zhang et al. [128] proposed two feature extraction methods to avoid misplacement phase error in THz reflection time-domain spectroscopy (THz-RTDS). In their work, the first or second order derivative of the phase of the relative sample reflectance was used to extract the frequency dependent absorption signatures of the materials under study. Ryniec et al. [129] applied decision trees as a feature selection method to identify THz spectra of different compounds, demonstrating the effectiveness of decision tree methods in the classification of THz spectra.
Generally, Fourier expansions suffer from drawbacks such as the notion of an infinite support in the time domain [130], which compromises the quality of the signal unless apodization routines are adopted [65] to eliminate Gibb's ripple resulting from discontinuities at the edges of the time domain interferogram. This is especially true when fast scan data acquisition is performed in TPI applications where the time domain signals are more truncated.
These drawbacks are more efficiently addressed through the use of windowed FTs that further reduce the number of coefficients needed to describe the transformed dataset in featureless parts of the spectrum as well as through the WT which addresses the issue by successively increasing the resolution (increase in scale) of both the temporal and frequency domain features of the TPI signal.

Wavelet coefficients as feature sets for THz pattern analysis
The objective of feature extraction techniques is to isolate the relevant features from the T-ray signals to improve classification performance. WTs complement the traditional Fourier-based techniques in THz signal analysis by providing superior timefrequency localization characteristics that are well-matched to the requirements for the short-duration T-ray pulse signals. Stephani et al. [131] used wavelet coefficients to extract features from hyperspectral THz-TDS datasets. The approach enabled a coarse pre-clustering operation representing the target images successfully. An alternative approach is through the wave atoms transform (WAT) which was first introduced by Fu et al. [132] in the context of THz transient processing of reflectance signatures. This is a multiresolution technique that has a sparser expansion for oscillatory and oriented sample textures. It can provide improved resolution for pattern identification, when textural artifacts contaminate the THz transient response is concealing the compositional absorbance or reflectance of the sample.
An alternative to occasionally generate even more parsimonious feature matrices, reported by Yin et al. [97], assuming AR, MA and ARMA models of different order, may also be considered, depending on the data structure. In that approach, the averages of the modelling coefficients, (denoted as DC values in Fig. 7), are computed over the three decomposition levels of the WT employed on each data set. The model coefficient averages are then joined to produce feature vectors with a dimension equal to the number of sub-bands in the adopted wavelet decomposition. The feature vectors obtained from two different AR orders, and MA orders may be combined respectively, to form the final AR and MA feature matrices. The ARMA feature matrix is obtained by combining two different orders of AR and MA vectors together. The extracted AR and MA feature vectors are calculated at each decomposition level j. This approach may be introduced to further suppress the number of features in the input of the classifier so as to further improve   Fig. 8. In this modelling, H and G denote the low-and high-pass filters, respectively, w f is the de-noised T-ray input. The arrow depicts the diadic down-sampling operator. Similar illustration related to DC AR and DC MA feature matrix are assumed.
its generalization ability. The complete procedure for calculating DC ARMA j is depicted in Fig. 8. An alternative to time-frequency analysis by WTs and further compression using MA routines is through the use of the Radon transform. Such approach has been used to successfully identify micro-Doppler motion of a target and has its origins in the highresolution radar identification community. Xu et al. [133] suggested a combination of time frequency analysis with the Radon transform to perform micro-feature extraction but the approach incorporates echo-cancelation that often leads to an undesirable channel spectrum in the frequency domain, a common source of error in TPI imaging that can significantly compromise classifier performance. Alternative feature extraction procedures include PCA, as discussed in [99] and [65]. The usual problem with such an approach is the necessity to have a reliable set of calibration samples, which is often unfeasible.

Pattern classification
In signal processing, pattern classification refers to the separation of patterns, measured or observed, into small classes, and then the assignment of each pattern to a particular class. The classification scheme is usually based on training sets that have already been successfully classified (supervised learning strategies). Learning can also be unsupervised, but such approaches usually fall short from a biomedical software certification perspective. The classification or description schemes that follow are mostly based on statistical (or decision-theoretic) approaches such as the Mahalanobis distance classifier, the Euclidean discrimination matrix, Support Vector Machines (SVMs) and Extreme Learning Machine (ELM) classifiers.

Feature based Mahalanobis distance classifiers
The Mahalanobis distance classifier [134] is a type of minimum distance classifier, which is optimal for normally distributed classes with equal covariance matrices (linear discriminant) and equal a priori probabilities. Such classifier is often chosen because it is simple to implement and it provides reasonable results for a variety of biomedical waveforms. One possible approach is to formulate the Mahalanobis classification scheme on a set of feature matrices of ARMA modelled datasets after signal decomposition in wavelet subbands [97]. For a given class, m, the distance from a feature matrix DC l j to the class mean A m , is defined as where C m is the covariance matrix of the feature vectors regarding class m, DC l j with l = 1, 2, 3 represents the averaged coefficients matrix related to AR (l = 1), MA (l = 2), and ARMA (l = 3) modeling of wavelet approximation coefficients at three decomposition levels j, that is, DC 1 j being DC AR j , DC 2 j being DC MA j , DC 3 j being DC ARMA j . In practice, the covariance matrix is estimated from the training vectors. During classification, the minimum Mahalanobis distance from feature matrix DC l j to each class centre A m is used to assign the appropriate class label.

Support Vector Machine classifiers (SVMs)
Kernel based learning and SVM methodologies reside at the core of a range of interdisciplinary challenges. Their formulation shares concepts from different disciplines such as: linear algebra, mathematics, statistics, signal processing, systems and control theory, optimization, machine learning, pattern recognition, data mining and neural networks. The idea of the SVM is to map data from the input space into a high-dimensional feature space, in which an optimal separating hyper-plane that maximizes the boundary margin between two classes can be established. At its core, SVMs are two-class classifiers. Recently, SVMs have been extended to solve multi-class classification problems from noisy biomedical measurements. Furthermore, there are several reports discussing the use of SVMs for THz material identification. Pan et al. [135] used SVMs to classify THz absorption spectra for the purpose of illicit drugs identification. They successfully identified seven pure illicit drugs establishing the methodology as an efficient method for drug identification. Fitzgerald et al. [99] applied the SVM approaches combined with a radial basis function to discriminate normal from malignant breast tissue from THz-TPI. Yin et al. [127] applied SVMs to perform multi-class classification of THz powder spectra for six types of powder materials with similar optical properties. Fig. 9 illustrates the multi-class separation for the six types of powder substances using SVMs, which are designed according to a pair wise-strategy. One real Gaussian kernel with C = 1000 and = 1 ×10 −7 is used to map the input data into a 2D Fourier feature space for visualisation purposes. The support vectors indicated by cyan circles are subsets of the training data sets and are used to construct a two-dimensional hyper-plane in feature space, which acts as a boundary separating each class of different powder materials.

Complex valued ELM classifiers
Our recent work [136] extended a very important class of recently developed classifiers called Extreme Learning Machines (ELMs) to complex valued problems [137,138]. The motivation for the proposed extension stems from the fact that the real valued ELM has shown some of the lowest training errors among machine learning algorithms and in particular SVM classifiers [139,140,127,130]. Furthermore, existing machine learning techniques are focused on real valued datasets. Traditional amplitude only based pattern mining approaches of spatio-temporal images has classification limitations in samples that display highly correlated time-space features. The complex-valued ELMs (CELM) adopts induced complex RKHS kernels [141] to map inputs from complex-valued non-linear spaces to other real valued higher dimensional linear spaces. This permits us to classify the inputs with linear complex valued feature vectors i.e. preserving the information in the phase of the signal (dispersion). The approach is based on concepts developed for quaternary variables classification, through the introduction of two complex-coupled hyper-planes [142]. A widely linear estimation processing approach is adopted and the argument composed of the sum of the two parts (real and imaginary) is employed to relate the input feature space to the output feature space through the hidden layer of the classifier. The approach enables us to define a kernel function specific for the separation of the data in high dimensional complex coupled hyper-planes. The approach is compatible with the processing of datasets in tensorial format which enable additional image features (hyperspectral, amplitude, phase, polarization or spatiotemporal components) to be simultaneously retained.
The CELM classifier approach has a very broad applications domain across the biomedical community encompassing all types of research associated to the study of the interaction of matter with waves, and in particular spectroscopy (acoustic, dielectric, optical, THz, infrared, electron-spin resonance, nuclear magnetic or paramagnetic resonance, etc.) as well as imaging and tomography modalities encountered across the physical, chemical and biomedical disciplines. It is thus fundamental both from a machine learning as well as from a chemometrics perspective [143]. Because spectroscopic responses are analogous to the blurring function (relating amplitude and phase) developed by Bode [144] to describe the dynamics of physical systems, CELMs have a wide range of applications across all physical sciences. A typical example of the proposed approach is the recent study [145] which focused on the use of CELM to perform binary and multi-class classification of RNA and powder samples respectively on the basis of images acquired by THz-TPI. The analysis was performed on large data sets as would be the case in a typical biomedical or quality control setting. Classification was performed on the basis of discernible features in the measured THz spectra. Examples of learning vector patterns for multiclass recognition via CELM, are shown in Fig. 10(a) after FT of the time-domain signatures and extraction of the corresponding complex valued features in frequency domain, regarding phase and amplitude. We used 49 input vectors related to each powder sample for training the classifier. Two real RKHS kernels were used for mapping. The optimal Gaussian parameter of was set to 100 and the penalty parameter C was set to 0.1. The labels were complex-valued and produced 12 output classes. Background colour shows the contour shape of the decision surface, (these are numbered from 2 to 12), these correspond to the amplitude calculations derived from the sum of real and imaginary values of the respective complex labels. It can be observed that THz measurements of powder samples of salt, sand, talcum, are grouped more tightly than the powder samples of flour, soda and sugar. The labelled contours that correspond to different real and imaginary parts (the real and imaginary parts label the different classes) are illustrated in Fig. 10(b). These regions are undecided in the classification process and are therefore excluded to avoid over-fitting problems.
The CELMs may be naturally extended to multi-pixel or voxel images. This aims to achieve complex valued learning of 3D inputs of complex valued features, i.e., to classify the complex valued input data selected from a tensor. The proposed approach will address aspects of quaternary classification within a tensor algebra context. For 3D inputs, three pairs of complex coupled hyperplanes will be designed through orthogonal projections. The approach enables us to define kernel function specific for the calculation of high dimensional complex coupled hyper-planes. It allows effective classification of a more natural representation of the data in a tensor format. The approach is also extendable to hierarchical clustering as discussed in the following section.

A further example for cancer diagnosis using THz imaging and pattern classification algorithms
Eadie et al. [146] carried out multi-dimensional THz imaging analysis for colon cancer diagnosis. Their research uses decision trees to find important parameters, with neural networks (NN) and SVMs that are used to classify the THz data as indicating normal or abnormal samples. This work finds the sensitivities of 90-100% and the specificities of 86-90% in the colon cancer diagnosis. Thus, the  use of THz reflection imaging allows the colon cancer samples to be detected on the basis of either true or false positive or true or false negative diagnosis outcomes derived from specificity / sensitivity metrics according to the Table 2 below, where TN and TP signify true negative and true positive respectively and FN and FP a false negative or false positive outcome.

Clustering techniques to segment THz images
Clustering, also termed cluster analysis is the formal study of algorithms and methods for grouping unlabelled data into subsets (called clusters) according to measured or perceived intrinsic characteristics or similarity. Clustering deals with data without using category labels that tag objects with prior identifiers, i.e., class labels. The absence of category information distinguishes data clustering (unsupervised learning) from classification or discriminant analysis (supervised learning). The two most frequently used clustering techniques are the k-means and the ISODATA clustering algorithm. Both of these algorithms use iterative procedures in their cluster estimation process. In general, both of them assign first an arbitrary initial cluster vector. Then each pixel is classified to the closest cluster. Finally the new cluster mean vectors are calculated based on all the pixels in one cluster. The second and third steps are repeated until the change between the iteration is small. The change can be defined in several different ways, either by measuring the distances that the mean cluster vector have changed from one iteration to another or by the percentage of pixels that have changed between iterations. The ISODATA algorithm has some further refinements such as the option of splitting and merging of clusters [147]. Clusters are merged if either the number of members (pixel) in a cluster is less than a certain threshold or if the centers of two clusters are closer than a certain threshold. Clusters are split into two different clusters if the cluster standard deviation exceeds a predefined value and the number of members (pixels) is twice the threshold for the minimum number of members.
Currently, several papers report clustering techniques to segment THz images. Brun et al. [148] report on THz-TDS imaging of 10 m thick histological sections, where clustering methods are used to cluster THz spectral images that are produced on the basis of the extracted refractive index data. The results show that THz spectral differences exists not only between tumor and healthy tissues but also within tumors. Ayech and Ziou have discussed k-means clustering methods for segmentation of THz imaging. In [149], a combination of autoregressive (AR) model and PCA is proposed to extract effective temporal/spectral features from TPI before carrying out soft decision of Kharmonic-means (KHM), which outperform the algorithms based on hard decision of traditional k-means methods. In [150,151], a novel approach of segmentation of THz images is proposed, where the k-means technique is reformulated under a ranked set sample. Their approach consists of an estimation of the expected centers, selection of the relevant features and their scores, and subsequent classification of the observed pixels in the THz images. This method is essentially less sensitive to the initialization of the centers. A more recent research suggested in [152] uses a two-step partitioning clustering approach to segment THz measurements of the inner structure of cave bear teeth. A k-nearest neighbor graph that is built on the reduced channel information [153] is further split into segments by a minimum edge cut bi-sectioning method. The results show that the layerlike structures are discernable within the material, giving a more detailed image of the inner structure of the tooth. Using the ISO-DATA algorithm to cluster THz spectra for THz image segmentation was first suggested by Berry et al. [154]. In their work, two specimens were examined in this pilot study, one of basal cell carcinoma and one of melanoma. Unsupervised ISODATA classification using three selected parametric TPI was compared qualitatively with kmeans classification using the shape of the entire time series, and contrasted with conventional stained microscope slides. There was good qualitative agreement between the two classifications methods. As expected, classifications were consistent with the observed morphological characteristics of the tissue. The results have implications for the future development of feature sensitive technique as the need for only a small number of features could lead to considerably reduced acquisition times.

Analysis of DCE-MRI data
Since 1995, DCE-MRI screening has been steadily gaining ground as a preferred potentially highly sensitive new modality for clinical applications [155]. Because of its high 3D resolution and its ability to acquire kinetic contrast information, it has steadily gaining popularity over traditional diagnostic techniques such as mammography or ultrasound [156]. For breast tumors, lesion diagnostic sensitivities can reach 97% [157].
However, specificity of breast DCE-MRI is still rather low, with rates of between 30% and 70% [158,159]. High false positive detection rates on MRI often lead not only to anxiety for the patient but may also result in an unnecessary invasive biopsy [156,158]. This hinders its use as a routine imaging technique in breast cancer patients. Benefits of breast MRI screening include a potentially improved breast cancer detection rate for high-risk women and the provision of additional information regarding disease proliferation (through multiple scans).
Computer-aided diagnosis (CAD) approaches for breast MRI are typically employed for automatically identifying tumors from normal tissues when these are at a stage of rapid development [160,13,161,8] whereas the more complex task of classifying a lesion as benign or malignant [162,160,[163][164][165][166]20,167,4,168,169] is proving more difficult to address. In clinical practice, in order to enable the interpretation of patterns resulting from contrast enhancement across a series of MRI volumes, the intensity changes per voxel are color-coded by an automated kinetic assessment protocol. However, the technique is not fully automated and requires continuous feedback from experts. A major challenge in the diagnosis of breast DCE-MRI is the spatiotemporal association of tumour enhancement patterns, a task that humans are not as optimized to perform [4]. With many CAD systems now available commercially, Pan et al. [170] evaluated which system can best help detect signs of breast cancer on breast MRI. The most commonly used CAD systems in the USA are CADstream (CS) (Merge Healthcare Inc., Chicago, IL) and DynaCAD for Breast (DC) (Invivo, Gainesville, FL). Their primary objective in this study was to compare the CS and DC breast MRI CAD systems for diagnostic accuracy and postprocessed image quality. The experiments were aimed at evaluating 177 lesions in 175 consecutive patients who underwent second-look ultrasound guided biopsy or MRI-guided biopsy. The results illustrate that the two CAD systems had similar sensitivity and specificity (CS had 70 % sensitivity and 32 % specificity whereas DC had 81 % sensitivity and 34 % specificity). Both CS and DC had a high sensitivity for detecting malignant lesions on breast MRI; however, neither system significantly improved specificity for the diagnosis of benign lesions. The ROC curve plots using both CS and DC systems are illustrated in Fig. 11.

Outlook for future tensorial and Clifford algebra based feature and image registration
In dynamic pattern recognition methods for the analysis of DCE-MRI, the emphasis has been on either high temporal resolution and empirical analysis [93,171] or on high spatial resolution with a stand-alone morphologic feature extraction [171,172]. Time-series analysis is a time-consuming task due to spatiotemporal lesion variability. Changes in spatial intensity of imaged tumours are a further complication as they cause an inherent difficulty in segmentation of an object of interest [173]. For example, Fig. 12(a) depicts an imaged ductal carcinoma in situ (DCIS). While the parts depicted by the arrows show the same anatomical structure taken from the same tumour region, the intensity values are different. The intensity indicated by a yellow arrow is higher than the intensity indicated by the red arrows. After conducting intensity based segmentation as illustrated in Fig. 12(b), the region with low intensity may feature as a gap separating the image into two disconnected parts. The gap forms an area without edge. A multi-channel classification method that considers the associations between spatial and temporal features of high-dimensional images is proposed in order to achieve accurate diagnosis of tumour tissues. The detection of anomalies in spatiotemporal data is an emergent interdisciplinary topic that involves highly innovative computer science methods. Mining spatiotemporal patterns is critical for the correct identification of tumour anomalies in DCE-MRI and yet it remains challenging because of complexities in analyzing the time-series of 3D image data.
Recently, tensor decomposition of high-dimensional medical image data, i.e. fMRI, has also gradually drawn attention since it can explore the multi-way data's structure which exists inherently in human organ imaging [174]. Tensors are multimode (multiway) arrays, where vectors (i.e., one-mode tensors) and matrices (i.e., two-mode tensors) are special cases. The tensor representation captures useful information that is difficult to provide in a conventional vectorial formalism e.g. accounting for specific morphological features such as directional striations in vessels. To effectively utilize the rich information contained in tensors, we propose to extend the CELM for effective tensor classification. Since most standard learning algorithms assume data instances are feature vectors, it is not straightforward to apply these algorithms on tensorial data. The CELM method enables the identification and learning of inter-mode relations of features.
Unlike classic SVMs, the complex valued hyper-planes of CELM are calculated using the smallest norm of output weights with the smallest training error in a similar manner as in ELM. The technique discards the normal thresholding procedure found in traditional SVM classifiers, without calculating support vectors. The extended CELM has significant potential to solve complex valued problems for multiclass classification of tensorial datasets with dramatically reduced computational complexity and significantly improved computational speed. It enables the classification of tensorial data while preserving information associated with adjacent and overlapping data vectors as well as differentially extracted features.
Registration of images is a crucial step in many image processing applications where the final information is obtained by combining multiple input images. In many applications multichannel images are also available, requiring innovative processing of vector data. Traditional approaches in achieving multi-channel image registration can cause inaccuracies by introducing information loss or misinterpretations and may lead to inappropriate results. An alternative way to perform registration of multichannel images described by associated vectorial datasets is through the use of Geometric Algebras (such as Clifford Algebra). The main advantage of this algorithm is that it directly operates on the multichannel signal, instead of scaling the signal down to one dimension (e.g. by averaging) and thereby loosing a lot of information. A further advantage is that it enables fusion of datasets from heterogeneous sensing modalities, thus allowing for future integration through progress in biomedical sensing.

Performance measures
Evaluation of diagnostic tests is a matter of concern in modern medicine not only for confirming the presence of disease but also to rule out the disease in healthy subjects. Any diseased tissue detection process based on DCE-MRI screening provides a voxelbased classification result. From the frequency of test results among patients with and without disease based on the gold standard, any voxel in MRIs can be classified either as diseased or surrounding tissue. Consequently, there are four possibilities; two classifications and two misclassifications. The classifications are the true positive (TP) and the true negative (TN) where the number of tumour voxels and background voxels which are correctly detected respectively; the false positive (FP) is the number of pixels not belonging to a vessel, but is recognised as one, and the false negative (FN) is the number of pixels belonging to a vessel, but is recognised as background pixels, mistakenly.
One can further derive the probability of a positive test result for patients with disease and the probability of negative test results for patients without disease. The true positive rate (TPR) represents the fraction of voxels correctly detected as diseased voxels. The false positive rate (FPR) is the fraction of voxels erroneously detected as diseased voxels. The accuracy (Acc) is measured by the ratio of the total number of correctly classified voxels (sum of true positives and true negatives) to the number of voxels in the image field of view. Sensitivity (SN) reflects the ability of the algorithm to detect the diseased voxels. Specificity (SP), which can be expressed as 1-FPR is defined as the ability to detect non-diseased voxels. The positive predictive value (PPV) gives the proportion of identified diseased voxels which are true diseased voxels. The PPV is the probability that an identified diseased voxel is a true positive.
A receiver operating characteristic (ROC) analysis has become a popular method for evaluating the accuracy of medical diagnostic systems. The ROC curve plots the fraction of diseased voxels correctly classified as diseased tissues, namely the TPR), versus the fraction of non-diseased voxels wrongly classified as diseased voxels, namely the FPR). The better the performance of the system is, the closer it resides to the upper left hand corner of the ROC space. The most frequently used performance measure extracted from the ROC curve is the value of the area under the curve (AUC) which is 1 for an optimal system. For MRI images, the TPR and FPR are computed considering only voxels inside the field of view. Table 2 summarizes the performance metrics used by DCE-MRI image segmentation algorithms.

Extensions to multi-channel classifiers
By using a tensor algebra or Clifford algebra framework analysis of spatiotemporal associated features becomes possible, such advances therefore lead to the development of a multidimensional unified MRI framework for processing DCE-MRIs. Mining spatiotemporal associated features of lesions from MRIs can increase the accuracy and efficiency of pattern identification. Current DCE-MRI is not sufficiently accurate for the early detection of tumours because of a lack of association between the spatial and temporal features.
In this section, we introduce a novel dynamic tensor reconstruction algorithm aimed at principal component separation performed on the basis of an offline tensor analysis algorithm (OTA) in combination with PCA. This focuses on the analysis of dynamic projection matrices for principal component separation of cancerous and healthy tissues. The tensorisation of DCE-MRI is reconstructed via multidimensional unified analysis of MRI data according to tensor factorization. One of the advantages of such reconstruction is the incorporation of the temporal information into spatial voxels and projecting four dimensional time-spatial vectors into a three dimensional space that shows spatial and temporal information fusion with decreased number of dimensions for reduced computation cost.

Image registration of DCE-MRIs
In DCE-MRI, there are (i) spatial motion artefacts caused by patient movement, respiratory motion, intestinal peristalsis and cardiac pulsations during data collection [175][176][177], and (ii) signal intensity changes in T1-weighted images when the contrast agent diffuses out from the vascular tissue and accumulates in the interstitial space. These signal intensity variations lead to contrast agent concentration estimation errors which can further amplify errors in pharmacokinetic models of tissue blood volume and vascular permeability compromising evaluations of therapeutic response [178]. Proper registration of pixels in the chosen co-ordinate frame is a critical step in the data acquisition process as uncorrected voxel displacements from the motion artefacts will corrupt the voxel information.
In DCE-MRI data, there are also further challenges as time progresses after compound injection. Both rigid (alignment using only translation and rotation) and non-rigid algorithms (associated with more complex deformations) have been proposed for image registration, i.e. in DCE-MRIs of kidney [179], breast [180], liver [181], lungs [182,183] and the heart [184]. Reviews discussing advances in DCE-MRI image registration can be found in [185,186,175]. A conceptually straightforward rigid transformation is through manual delineation of the volume images of the target object after aligning the centers of gravity [187]. An automated feature-based algorithm has been presented by Song et al. [188]. In this work, waveletbased edge detection is followed by the computation of a geometric transformation based on a FT. Zikic et al. proposed a locally rigid registration algorithm with a gradient-based similarity measure to allow for global changes in kidney enhancement [179]. Another approach is to register the images by optimizing the fit of the enhancement curves to a pharmacokinetic model [189] and [190]. Nonrigid algorithms include a vertical, deformable transformation minimizing a cost function which suppresses motion and smoothes the enhancement curves [191] while at the same time maximize the mutual information using a cubic B-splines deformation [180,192]. More recent approaches to registration aim to incorporate additional a priori information based on specific anatomical markers [193], volume preservation of tissue [194] or local rigidity assumptions [195]. Schäfer et al. [196] propose a regional segmentation approach to study breast tissue lesions taking into consideration whether there was an observed similarity in the tissue perfusion characteristics, thus improving on single voxel-based approaches [197]. This approach has also additional advantages from a clinical diagnostics perspective.
Current literature [181] suggests that there are advantages in non-rigid registration when compared to rigid registration. For non-rigid registration, deformable image registration of DCE-MRI time series is accomplished using (normalized) mutual information (MI) [177,198] approaches. Normally, the images contain edge information between various tissue types. A gradient dependent cost functional is proposed for registration. In recent work, it was shown that normalized gradient fields (NGF) provide a viable alternative to MI for the registration of DCE-MRI images [175].
An alternative approach to non-rigid motion correction uses a Bayesian framework [199] to provide pharmacokinetic parameter estimation in DCE-MRI sequences. A physiological image formation model is used to provide the similarity measure used for motion correction. Hodneland et al. [175] compared a normalized gradients approach with the mutual information approach for motion correction of DCE-MRI datasets and showed that using cost functions based on normalized gradients can successfully suppress artifacts from moving organs in clinical DCE-MRI records.
An alternative approach, proposed by Lin et al. [200] discusses a respiratory motion-compensated DCE-MRI technique using k-space-weighted image contrast (KWIC) radial filtering. The technique combines the self-gating properties of radial imaging with the reconstruction flexibility provided by the golden-angle view-order strategy. The signal at the k-space center is used to determine the respiratory cycle, and consecutive views during the expiratory phase of each respiratory period are grouped into individual segments. The principle is to divide k-space into concentric rings. The boundary of each circular region is determined by the Nyquist criterion, after assuming that the views within each region have uniform azimuthal spacing.
The feature extraction algorithms mentioned earlier are relevant to both medical image registration as well as motion compensation [201,202]. An alternative approach to localize anatomical features in DCE-MRI is through the use of level sets, an approach originally proposed in [203,204]. The method is applicable to post-contrast enhanced MR images to delineate the variable shape of features of interest. Yin et al. [205] proposed such approach to localize anatomical features in-breast costal cartilage imaged using DCE-MRI. The contours in each layer are cumulatively added to the first contour to produce the results illustrated in Fig. 14(a). The shape of the feature of interest clearly varies from layer to layer. The variable shape of contours acquired from a level-set-based segment image actually determines the feature region of interest. This is subsequently used as a guide to specify initial masks for feature extraction. Fig. 14(b) shows the superposition of the mask and the level-set based projection of Fig. 14(a). The motion action of the fourth pair of breast costal cartilages are obtained by reprojecting the resultant segments from transaxial planes to sagittal planes. Rotational motion artefacts in the DCE-MRI are illustrated in Fig. 14(c).

Pattern identification of spatiotemporal association features of tumours in DCE-MRI data
One of the current challenges in breast DCE-MRI as a screening modality is reducing false positive detection errors, thereby boosting detection specificity. Computer-aided diagnosis (CAD) approaches for breast MRI are typically employed for automatically identifying tumors from normal tissues when these are at a stage of rapid development [160,13,161,8] whereas the more complex task of classifying a lesion as benign or malignant [162,160,[163][164][165][166]20,167,4,168,169] is proving more difficult to address. In dynamic pattern recognation methods, the emphasis has been on either high temporal resolution and empirical analyses [4,[17][18][19][20] or on high spatial resolution with a stand-alone morphologic feature extraction [166,169,161,8,4,9]. Even though time-series analysis enables radiologists to infer information regarding the tissue state, such assessment is a time-consuming task, because of spatiotemporal lesion variability. Currently, most studies consider aggregate measurements for tumour morphological characterization [4,161,169] with an initially model-free [161,169] and data-driven [13,160] segmentation according to manually marked region-of-interest (ROI).
Common practice in these methods is to process the imaged 3D volumes separately, and then incorporate the temporal information into the spatial databases through a separate processing step. Image reduction based feature extraction enables identification on the basis of the dominant features present in the image. For example, in [14,15], PCA was applied on enhanced and scaled datasets for a whole 2D object region obtained by DCE-MRIs. This is in contrast to traditional PCA applied in two-dimensional MRI image analysis ignoring any spatial information associated with a time series evolution of disease progression. To address the issues of low specificity and high inter-observer variability in breast DCE-MRI, the analysis of spatiotemporal patterns remains challenging [4] and requires the development of new software tools.
Representation of multi-dimensional features in a tensor space is a relatively new concept in the computer science and pattern recognition literature. The motivation of using tensor decomposition is to explore via multimode data analysis of image arrays the spatiotemporal correlations of sequences in DCE-MRI images.
Recent work [206] shows that there is potential to identify tumour shape by combining non-negative tensor decomposition and directional texture synthesis. The approach uses symmetry information about 3D shapes that is represented by 2D textures synthesised from sparse, decomposed images.
Tensorial analysis is directional so interactions of components within the associated matrices provide additional degrees of freedom for data analysis, enabling spatiotemporal data correlations to be made along each co-ordinate direction as shown in Fig. 15(a). Isolation of such correlations in each co-ordinate plane can provide a clearer picture of disease proliferation. A third order tensor that may be associated with a DCE-MRI dataset is illustrated in Fig. 15(b)-(d). Fig. 15(e) illustrates the way to flatten the third order tensors along frontal slices.
Tensor factorisation of a 3D spatial matrix uses multilinear algebra to analyse an ensemble of volume images, in order to separate and parsimoniously represent high-dimensional spatial datasets into constituent factors [207]. The 3D spatial image datasets are treated as a third order tensor. The image dataset tensor A (3) ∈ R I 1 ×I 2 ×I 3 is decomposed [208] or factorised to a core tensor C ∈ R J 1 ×J 2 ×J 3 and three different modes of 2D image matrices X (n) ∈ R In×Jn , n = 1, 2, 3, as illustrated in Fig. 15(f).
In our recent research [206], we explored tensor decomposition for the identification of shape with mirror symmetry. We conclude that if both the first mode matrix (i.e. along the y and the z axes) and second mode matrix (i.e. along the x and the z axes) are symmetric, the frontal plane (along the x and the y axes) is a mirror symmetric plane, and vice versa.
Spatial shape datasets with a simple geometry are used for illustration purposes (Fig. 16). These images show a three-dimensional mirror symmetry analysis of a spherical object with a radius of 31 pixels. Fig. 16(a) illustrates that the flattened basis images are cropped in the middle area after non-negative tensor decomposition of the sphere. As an example, reconstruction is performed using tensor multiplication of the core tensor, on the basis of first mode  and second mode matrices. Fig. 16(b) illustrates sparse texture extraction of the spherical object. Fig. 16(c) shows a synthesis of the extracted texture regarding the sphere. The resultant synthesised image is symmetric with both vertical and horizontal symmetric axes, which means the object is symmetric with the frontal plane as a reflective mirror. In a second example, we explore the mirror symmetry of brain structural MRI of white mass with rough resolution. The MR image size is 47 × 58 × 43. Fig. 17(a)-(c) illustrates one of the 2D crosssectional slices along a horizontal plane, vertical plane, and frontal plane. Fig. 17(d) illustrates a brain slice image with asymmetry along an x-y plane.
The generated 3D images with symmetry and asymmetry are assembled into the third order tensors and non-negative tensor decomposition is applied to factorise the non-negative tensors to factors, with the core tensor size of 32 × 32 × 32 and the flatten basis    (Figs 18c and 18d). The first, second, and third asymmetric MR images (with 15, 25, and 35 asymmetric layers respectively) were generated using the proposed methodology.
image size of 58 × 1024. The center region of the basis images is cropped to a size of 58 × 180. Fig. 18(a)-(d) illustrates the resultant synthesis for the symmetric MRI, and the first, second, and third asymmetric MR images (with 15, 25, and 35 asymmetric layers respectively) subsequently generated using the proposed methodology.
To ascertain the degree of symmetry (or the lack of it), (i) k means clustering is used to group the synthesis patterns and find the associated 2-dimensional geometric pattern; (ii) histogram images from the synthesis patterns are used to evaluate the degree of intensity symmetry in the image. In this way, the analysis of 3D shape can be mapped into a 2D space, therefore, performing the required dimensionality reduction. The current study thus proposes a way forward towards addressing the challenges associated with tensor decomposition of MRIs for abnormality detection in simulated breast tumors and in brain structural MRI. This is currently most relevant to MRI in clinical practice, but can benefit the TPI community, especially if such systems are to soon undergo further clinical trials. If systematically implemented, the proposed advances are applicable to both THz as well as MRI imaging modalities and are likely to provide in the near future improved diagnosis for Alzheimer's disease and may also assist in the early diagnosis of dementia by translating current understanding in cell biology into therapeutic advances [209][210][211].

Conclusion and future work
This review considered commonalities in datasets acquired using TPI and DCE-MRI measurement modalities. Both approaches are currently being explored as viable alternative imaging modalities to assess disease proliferation in a non-invasive manner. DCE-MRI is well established and as such it is regularly used in clinical environments. In contrast, TPI has yet to gain popularity although there is a general recognition of its potential to provide complementary information to clinicians.
The TPI scans provide information from individual pixels that show wavelength dependent attenuation, dispersion and phase delay according to the state of hydration of the tissue. Specific vibrational signatures may also be identified in the frequency domain after the Fourier transformation of the time-domain data. Similarly, MRI datasets are based on observations of de-excitation lifetimes so there are common grounds for the signal processing of both signal types. Furthermore, in both cases, specific signatures may be identified and assigned to biomarkers to improve on specificity and additional molecular identification of compounds.
Both systems have slow image acquisition rates. This can be distressing to patients in a clinical environment and leads to movement artifacts which need to be corrected. Correction algorithms using image registration can account for the movement of organs in DCE-MRI, but have yet to be used by the TPI community. Signal de-noising is more advanced in the THz community, whereas techniques for sparser data acquisition are more developed in the MRI community. Fusion of spatiotemporal information is better addressed by the DCE-MRI community so there are exciting future opportunities to adopt such algorithms by the TPI community. In addition, advances in pharmacokinetic models in DCE-MRI can be transferred to models for movement of water and biomarkers in TPI. There is therefore scope for placing these algorithms in a more unified framework. Furthermore, both scanning systems generate very large datasets which need to be appropriately managed by trained professionals. In addition, both systems can be enhanced by pulse shaping methodologies that can selectively excite molecular systems. Pre-processing using adaptive signal apodization, can maximize the resolving power of the imaging system. The fitting of ARX, ARMAX and subspace identification models can provide a more parsimonious signal representation that can facilitate de-noising and extract de-excitation lifetimes in a very parsimonious manner. In TPI, wavelet de-noising, especially using adaptive wavelets can compress the signals to a few coefficients before introducing them into a classifier. Note that SURE de-noising provides a universally accepted way to perform thresholding of coefficients associated to any of the above data transformations. Modelling using fractional order system identification routines is also an important emergent modality of relevance to both imaging systems.
To achieve accurate detection and diagnosis of tumours, emphasis should be placed on the analysis of spatiotemporal features using a unified perspective. Automatic classification may be performed using either Mahalanobis, SVM or ELM classifiers. In the case of SVM and ELM algorithms, their complex extensions are more useful because features in amplitude and phase or time and frequency respectively may be simultaneously presented to the classifier as different entities. Separately tuned kernels for the real and complex parts of the signal can improve classification accuracy. Extensions to multi-channel kernels using quaternary algebra and tensorial image registration enable additional measurement parameters to be considered such as state of polarization, or additional image morphological features such as tissue folds and striations that account for different degrees of dispersion of the signal in preferential directions across an image to be taken into consideration at the input space of the classifiers. Such approach can lead to improved classification accuracy. Multi-channel classifiers also enable information from other sensing modalities associated to image de-blurring to be taken into consideration in the classification task. Such approach also enables information from images at different time stamps to be also fused before being presented to the classifier.
The observed commonalities in the signal processing of both datasets demonstrate that it is natural to envisage future systems combining both diagnostic capabilities. There is thus a need to integrate the proposed algorithms under a single software environment for data analysis and visualization. Such approach will potentially lead to automated quantitative assessment of disease proliferation in a manner that may be accepted by clinicians.