By Shuyang
Fault-tolerant programming is an essential design practice that improves the reliability, stability, and robustness of applications.
In business development, it is crucial to design a system architecture that is reusable, scalable, and orchestratable, because this directly determines how efficiently business requirements can be iterated. At the same time, business and technical support personnel should adopt a pessimistic perspective. In a distributed environment, occasional fluctuations in HSF (High-speed Service Framework, Alibaba's RPC framework) services caused by single-point issues are not uncommon. Common hardware and software problems include system fluctuations, single points of failure, service timeouts, service exceptions, middleware fluctuations, network timeouts, and configuration errors. Ignoring these exceptions weakens the robustness of the service and can hurt user experience, lead to user complaints, and even cause system failures. Therefore, when designing solutions and implementing them, it is essential to consider the possible failure scenarios and program defensively.
Calls to third-party interfaces fail often. The usual responses are to retry the failed call or to store the failure for later handling. However, retrying does not suit every scenario. A call that fails parameter validation fails again on every retry, and before retrying you need to consider whether the operation is a read or a write and whether it is idempotent. Retrying is appropriate when a remote call times out or the network is briefly interrupted, and allowing multiple retries increases the likelihood of a successful call. To simplify later troubleshooting and failure-rate calculation, record the number of failures and whether each failure was stored successfully; this also makes it easier to count and schedule retry tasks.
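The decision described above can be sketched as a small predicate. This is only an illustration: the class name RetryDecision, the method shouldRetry, and the choice of exception types are assumptions, not part of any framework mentioned in this article.

```java
import java.net.ConnectException;
import java.net.SocketTimeoutException;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: decides whether a failed call is worth retrying. */
final class RetryDecision {

    private RetryDecision() {
    }

    /**
     * @param error      the failure raised by the remote call
     * @param idempotent whether repeating the call cannot change the outcome
     */
    static boolean shouldRetry(Throwable error, boolean idempotent) {
        // Invalid input fails the same way every time, so retrying is pointless.
        if (error instanceof IllegalArgumentException) {
            return false;
        }
        // A non-idempotent write may already have been applied before the failure.
        if (!idempotent) {
            return false;
        }
        // Transient transport problems are the typical retry candidates.
        return error instanceof SocketTimeoutException
                || error instanceof ConnectException
                || error instanceof TimeoutException;
    }
}
```

In a real system the list of transient exception types depends on the client library in use, so it would normally be configurable.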
This article summarizes elegant retry techniques in the face of service failures, such as AOP and CGLIB. It also provides an analysis of the source code of retry tools and components, along with some important notes.
@Test
public Integer sampleRetry(int code) {
    System.out.println("sampleRetry,time:" + LocalTime.now());
    int times = 0;
    while (times < MAX_TIMES) {
        try {
            // Return as soon as the call succeeds; otherwise the loop would never end.
            Integer result = postCommentsService.retryableTest(code);
            System.out.println("sampleRetry,return!");
            return result;
        } catch (Exception e) {
            times++;
            System.out.println("Number of retries: " + times);
            if (times >= MAX_TIMES) {
                // Store the record so that a scheduled task can retry it later.
                // do something record...
                throw new RuntimeException(e);
            }
        }
    }
    return null;
}
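The same loop can be pulled into a reusable helper so that business code does not repeat it. The following is a minimal sketch using only the JDK; the class name SimpleRetry and its parameters are hypothetical.

```java
import java.util.concurrent.Callable;

/** Hypothetical reusable version of the manual retry loop. */
final class SimpleRetry {

    private SimpleRetry() {
    }

    /**
     * Runs the task until it succeeds or maxAttempts is reached,
     * sleeping delayMillis between failed attempts.
     */
    static <T> T call(Callable<T> task, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Success ends the loop immediately.
                return task.call();
            } catch (Exception e) {
                last = e;
                // Wait only if another attempt will follow.
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        // All attempts failed: surface the last failure to the caller.
        throw last;
    }
}
```

A caller would write something like SimpleRetry.call(() -> service.retryableTest(code), 3, 100L), which keeps the business call and the retry policy visibly separate.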
In certain scenarios, it may not be appropriate or possible for one object to directly reference another. In these instances, a proxy object can serve as a mediator, facilitating communication between the client and the target object. The benefit of employing a proxy is its high compatibility, allowing it to be invoked by any retry method.
public class DynamicProxyTest implements InvocationHandler {
    private final Object subject;

    public DynamicProxyTest(Object subject) {
        this.subject = subject;
    }

    /**
     * Obtain a dynamic proxy.
     *
     * @param realSubject the real object to be proxied.
     */
    public static Object getProxy(Object realSubject) {
        // Pass in the real object to proxy. Methods are ultimately invoked on this object.
        InvocationHandler handler = new DynamicProxyTest(realSubject);
        return Proxy.newProxyInstance(handler.getClass().getClassLoader(),
                realSubject.getClass().getInterfaces(), handler);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        int times = 0;
        while (times < MAX_TIMES) {
            try {
                // A call on the proxy is dispatched to this invoke method, which forwards it to the real object.
                return method.invoke(subject, args);
            } catch (Exception e) {
                times++;
                System.out.println("Number of retries: " + times);
                if (times >= MAX_TIMES) {
                    // Store the record so that a scheduled task can retry it later.
                    // do something record...
                    throw new RuntimeException(e);
                }
            }
        }
        return null;
    }
}
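@Test
public Integer V2Retry(int code) {
    RetryableTestServiceImpl realService = new RetryableTestServiceImpl();
    // A JDK dynamic proxy implements only the target's interfaces, so cast to the interface
    // (assumed here to be named RetryableTestService), not to the implementation class.
    RetryableTestService proxyService = (RetryableTestService) DynamicProxyTest.getProxy(realService);
    return proxyService.retryableTest(code);
}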
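For readers who want to run the proxy approach in isolation, here is a self-contained sketch using only the JDK. GreetingService, FlakyGreetingService, and RetryingHandler are hypothetical names invented for this example. It also unwraps InvocationTargetException, which reflection uses to wrap the real failure.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

/** Hypothetical service interface; JDK proxies can only implement interfaces. */
interface GreetingService {
    String greet(String name);
}

/** Fails twice, then succeeds, to simulate a transient fault. */
final class FlakyGreetingService implements GreetingService {
    private int calls = 0;

    @Override
    public String greet(String name) {
        calls++;
        if (calls < 3) {
            throw new IllegalStateException("transient failure " + calls);
        }
        return "hello " + name;
    }
}

/** Retries every method of the wrapped target up to maxAttempts times. */
final class RetryingHandler implements InvocationHandler {
    private final Object target;
    private final int maxAttempts;

    RetryingHandler(Object target, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.target = target;
        this.maxAttempts = maxAttempts;
    }

    @SuppressWarnings("unchecked")
    static <T> T wrap(T target, Class<T> iface, int maxAttempts) {
        return (T) Proxy.newProxyInstance(iface.getClassLoader(),
                new Class<?>[] {iface}, new RetryingHandler(target, maxAttempts));
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        Throwable last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                // Reflection wraps the real failure; unwrap it before deciding.
                Throwable cause = e.getCause();
                if (!(cause instanceof Exception)) {
                    throw cause; // Errors are not retried.
                }
                last = cause;
            }
        }
        throw last;
    }
}
```

Usage: GreetingService service = RetryingHandler.wrap(new FlakyGreetingService(), GreetingService.class, 3); the caller then invokes service.greet("bob") without knowing that retries happen behind the interface.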
CGLIB is a code generation library that extends Java classes and implements interfaces at runtime by generating bytecode. It can create a subclass of the target class to act as a proxy, so behavior can be enhanced without modifying the original class and without requiring the target to implement an interface. This technology is widely used in AOP frameworks, ORM frameworks, caching frameworks, and other Java applications.
public class CglibProxyTest implements MethodInterceptor {
@Override
public Object intercept(Object o, Method method, Object[] objects, MethodProxy methodProxy) throws Throwable {
int times = 0;
while (times < MAX_TIMES) {
try {
//Call the parent class method through the proxy subclass.
return methodProxy.invokeSuper(o, objects);
} catch (Exception e) {
times++;
if (times >= MAX_TIMES) {
throw new RuntimeException(e);
}
}
}
return null;
}
/**
* Obtain the proxy class.
* @param clazz class information
* @return result of the proxy class
*/
public Object getProxy(Class clazz){
Enhancer enhancer = new Enhancer();
//The class of the target object.
enhancer.setSuperclass(clazz);
enhancer.setCallback(this);
//Create a subclass instance of the target object class as a proxy through bytecode.
return enhancer.create();
}
}
@Test
public Integer CglibRetry(int code) {
    RetryableTestServiceImpl proxyService = (RetryableTestServiceImpl) new CglibProxyTest().getProxy(RetryableTestServiceImpl.class);
    return proxyService.retryableTest(code);
}
In daily development, momentary jitter when calling a third-party HSF service is quite common. To reduce the impact of call timeouts on your business, you can use HSF synchronous retry, depending on the characteristics of the business and of the downstream service. Unless retries are configured explicitly, an HSF call is not retried automatically when it times out. The @HSFConsumer annotation has a retries parameter that sets the number of retries on failure. Its default value is 0.
@HSFConsumer(serviceVersion = "1.0.0", serviceGroup = "hsf",clientTimeout = 2000, methodSpecials = {
@ConsumerMethodSpecial(methodName = "methodA", clientTimeout = "100", retries = "2"),
@ConsumerMethodSpecial(methodName = "methodB", clientTimeout = "200", retries = "1")})
private XxxHSFService xxxHSFServiceConsumer;
The following figure shows the process of calling an HSF service:
Retries on an HSF timeout occur in AsyncToSyncInvocationHandler#invokeType(...):
If the retries parameter is greater than 0, the retry() method is used, and a retry is triggered only by a TimeoutException.
private RPCResult invokeType(Invocation invocation, InvocationHandler invocationHandler) throws Throwable {
final ConsumerMethodModel consumerMethodModel = invocation.getClientInvocationContext().getMethodModel();
String methodName = consumerMethodModel.getMethodName(invocation.getHsfRequest());
final InvokeMode invokeType = getInvokeType(consumerMethodModel.getMetadata(), methodName);
invocation.setInvokeType(invokeType);
ListenableFuture<RPCResult> future = invocationHandler.invoke(invocation);
if (InvokeMode.SYNC == invokeType) {
if (invocation.getBroadcastFutures() != null && invocation.getBroadcastFutures().size() > 1) {
//broadcast
return broadcast(invocation, future);
} else if (consumerMethodModel.getExecuteTimes() > 1) {
//retry
return retry(invocation, invocationHandler, future, consumerMethodModel.getExecuteTimes());
} else {
//normal
return getRPCResult(invocation, future);
}
} else {
// pseudo response, should be ignored
HSFRequest request = invocation.getHsfRequest();
Object appResponse = null;
if (request.getReturnClass() != null) {
appResponse = ReflectUtils.defaultReturn(request.getReturnClass());
}
HSFResponse hsfResponse = new HSFResponse();
hsfResponse.setAppResponse(appResponse);
RPCResult rpcResult = new RPCResult();
rpcResult.setHsfResponse(hsfResponse);
return rpcResult;
}
}
As the code above shows, only synchronous calls are retried. When the configured number of executions for the consumer method is greater than 1 (consumerMethodModel.getExecuteTimes() > 1), the call goes through the retry method:
private RPCResult retry(Invocation invocation, InvocationHandler invocationHandler,
ListenableFuture<RPCResult> future, int executeTimes) throws Throwable {
int retryTime = 0;
while (true) {
retryTime++;
if (retryTime > 1) {
future = invocationHandler.invoke(invocation);
}
int timeout = -1;
try {
timeout = (int) invocation.getInvokerContext().getTimeout();
RPCResult rpcResult = future.get(timeout, TimeUnit.MILLISECONDS);
return rpcResult;
} catch (ExecutionException e) {
throw new HSFTimeOutException(getErrorLog(e.getMessage()), e);
} catch (TimeoutException e) {
//retry only when timeout
if (retryTime < executeTimes) {
continue;
} else {
throw new HSFTimeOutException(getErrorLog(e.getMessage()), timeout + "", e);
}
} catch (Throwable e) {
throw new HSFException("", e);
}
}
}
The HSF consumer timeout retry principle is based on a simple while loop with try-catch.
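The same idea can be reproduced with plain JDK futures. The sketch below is not HSF code; TimeoutRetry and its parameters are hypothetical. It retries only when waiting for the result times out and lets any other failure propagate immediately, which mirrors the behavior described above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical JDK-only illustration of "retry only on timeout". */
final class TimeoutRetry {

    private TimeoutRetry() {
    }

    static <T> T invoke(ExecutorService pool, Callable<T> call, long timeoutMillis, int executeTimes)
            throws ExecutionException, InterruptedException, TimeoutException {
        int attempt = 0;
        while (true) {
            attempt++;
            // Each attempt submits the call again, like re-invoking the remote service.
            Future<T> future = pool.submit(call);
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Give up on the slow attempt before deciding whether to try again.
                future.cancel(true);
                if (attempt >= executeTimes) {
                    throw e;
                }
            }
            // ExecutionException (a failure inside the call) is not caught here,
            // so business errors are never retried.
        }
    }
}
```

Note that executeTimes counts total executions, so a value of 3 means one initial call plus at most two retries.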
Spring Retry, a project in the Spring ecosystem, provides declarative retry support that allows for standardized handling of retries for specific operations. It is well suited to business scenarios that require retries, such as network requests and database access. With Spring Retry, you can set up retry policies with annotations instead of writing lengthy code, which makes it easy to use and understand.
<dependency>
<groupId>org.springframework.retry</groupId>
<artifactId>spring-retry</artifactId>
</dependency>
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-aspects</artifactId>
</dependency>
After adding the spring-retry dependency, add the @EnableRetry annotation to the Spring Boot startup class.
@EnableRetry
@SpringBootApplication(scanBasePackages = {"me.ele.camp"},excludeName = {"me.ele.oc.orm.OcOrmAutoConfiguraion"})
@ImportResource({"classpath*:sentinel-tracer.xml"})
public class Application {
    public static void main(String[] args) {
        System.setProperty("APPID","alsc-info-local-camp");
        System.setProperty("project.name","alsc-info-local-camp");
        SpringApplication.run(Application.class, args);
    }
}
@Override
@Retryable(value = BizException.class, maxAttempts = 6)
public Integer retryableTest(Integer code) {
System.out.println("retryableTest,time:" + LocalTime.now());
if (code == 0) {
throw new BizException("Exception", "Exception");
}
BaseResponse<Object> objectBaseResponse = ResponseHandler.serviceFailure(ResponseErrorEnum.UPDATE_COMMENT_FAILURE);
System.out.println("retryableTest,Correct!");
return 200;
}
@Recover
public Integer recover(BizException e) {
System.out.println("Callback method executed!");
//Add logs to the database or call the remaining methods.
return 404;
}
The code shows that the @Retryable annotation is added to the implementation method. @Retryable has the following configurable parameters:
value | An alias for include: a retry is triggered only when one of the specified exceptions is thrown. |
include | The exceptions that trigger a retry. This parameter is empty by default. If the exclude parameter is also empty, all exceptions trigger a retry. |
exclude | The exceptions that do not trigger a retry. |
maxAttempts | The maximum number of attempts, including the first call. The default value is 3. |
backoff | The retry delay policy, configured with the @Backoff annotation. Its default delay is 1000 milliseconds. |
multiplier | A parameter of @Backoff that specifies the delay multiplier. The default value is 0, which means the delay between attempts stays fixed (one second by default). For example, with a delay of 2000 milliseconds and a multiplier of 1.5, the first retry occurs after 2 seconds, the second after 3 seconds, and the third after 4.5 seconds. |
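The arithmetic behind delay and multiplier can be checked with a few lines of plain Java. This sketch does not use Spring Retry; BackoffSchedule is a hypothetical helper, and it treats any multiplier that does not exceed 1 as a fixed delay, which is an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists the waits before each retry. */
final class BackoffSchedule {

    private BackoffSchedule() {
    }

    /**
     * @param initialDelayMillis wait before the first retry
     * @param multiplier         growth factor applied after each retry
     * @param retries            number of retries to list
     */
    static List<Long> delays(long initialDelayMillis, double multiplier, int retries) {
        List<Long> result = new ArrayList<>();
        double current = initialDelayMillis;
        for (int i = 0; i < retries; i++) {
            result.add(Math.round(current));
            // Assumption: a multiplier of 1 or less leaves the delay unchanged.
            if (multiplier > 1.0) {
                current *= multiplier;
            }
        }
        return result;
    }
}
```

With these assumptions, delays(2000, 1.5, 3) yields 2000, 3000, and 4500 milliseconds, and delays(1000, 0, 3) yields 1000 milliseconds three times.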
Spring Retry also offers the @Recover annotation, which marks a fallback method that runs after all @Retryable attempts have failed. If you do not need a fallback, simply omit it; in that case, the last exception is thrown to the caller once the attempts are exhausted. The BizException e parameter determines which failure the fallback handles: when all attempts have failed with a BizException, that exception is passed to the recover method.
Notes:
• The first parameter of the @Recover method must be the exception type thrown by the @Retryable method; otherwise, the method is not recognized as its fallback.
• The return type of the method annotated with @Recover must be the same as that of the method annotated with @Retryable.
• The fallback method and the retried method must be written in the same implementation class.
• Because Spring Retry is based on AOP, it does not support self-invocation within the same class.
• Do not catch the exception inside the @Retryable method. The exception must propagate out of the method for a retry to be triggered, and it must match the configured exception types.
Sequence diagram of a Spring Retry call:
The basic principle of Spring Retry is to introduce AOP capability through the @EnableRetry annotation. When the Spring container starts, it scans all methods with @Retryable and @CircuitBreaker annotations and generates a PointCut and Advice for them. When a method call occurs, Spring delegates the call to the interceptor RetryOperationsInterceptor, which implements the backoff retry on failure and the degradation recovery method. This design pattern makes the implementation of the retry logic simple and makes full use of the AOP capabilities provided by the Spring framework, thus achieving an efficient and elegant retry mechanism.
While Spring Retry can elegantly implement retries, it still has two unfriendly designs:
Firstly, the retry trigger is restricted to Throwable subclasses, which means that retries are designed around catching exceptions. If you want a returned data object to trigger a retry, you have to wrap it in a Throwable subclass.
Secondly, the retry callback (doWithRetry) signals that another attempt is needed by throwing an exception rather than by returning a value, which differs from how a normal result check is usually written.
Spring Retry advocates annotated retries of methods. The retry logic is executed synchronously, and the failure of retries refers to a Throwable exception. If you are trying to determine whether a retry is needed based on the state of the returned value, you may have to judge the returned value by yourself and then explicitly throw an exception.
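The workaround described above, turning an unacceptable return value into an exception, can be sketched with plain Java. Nothing here is Spring Retry API: UnexpectedStatusException, ResultRetry, and the status code 200 are hypothetical choices for the illustration.

```java
import java.util.function.IntSupplier;

/** Hypothetical exception that turns an unacceptable return value into a retryable failure. */
final class UnexpectedStatusException extends RuntimeException {
    UnexpectedStatusException(int status) {
        super("unexpected status " + status);
    }
}

final class ResultRetry {

    private ResultRetry() {
    }

    /** Converts a bad return value into an exception so that exception-based retry can see it. */
    static int requireOk(int status) {
        if (status != 200) {
            throw new UnexpectedStatusException(status);
        }
        return status;
    }

    /** Stand-in for the framework: retries while requireOk keeps throwing. */
    static int callWithRetry(IntSupplier call, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        UnexpectedStatusException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return requireOk(call.getAsInt());
            } catch (UnexpectedStatusException e) {
                last = e;
            }
        }
        throw last;
    }
}
```

In a Spring Retry setup, the call to requireOk would sit inside the @Retryable method and the exception type would be listed in the annotation, so the framework replaces the loop shown here.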
Guava Retrying (guava-retrying) is a small extension to Guava, the core Java library developed by Google. It provides a general-purpose way to retry arbitrary Java code, with configurable stop, retry, and exception-handling behavior built on Guava's predicate matching. It supports a variety of retry policies, such as limiting the number of attempts and setting the wait interval between them, and it uses predicates to decide whether a retry should be performed. Because it builds on Guava types such as Predicate, it fits naturally into code that already uses Guava.
<dependency>
<groupId>com.github.rholder</groupId>
<artifactId>guava-retrying</artifactId>
<version>2.0.0</version>
</dependency>
public static void main(String[] args) {
Callable<Boolean> callable = new Callable<Boolean>() {
@Override
public Boolean call() throws Exception {
// do something useful here
log.info("call...");
throw new RuntimeException();
}
};
Retryer<Boolean> retryer = RetryerBuilder.<Boolean>newBuilder()
//retryIf Retry conditions
.retryIfException()
.retryIfRuntimeException()
.retryIfExceptionOfType(Exception.class)
.retryIfException(Predicates.equalTo(new Exception()))
.retryIfResult(Predicates.equalTo(false))
//Wait policy: Each request is sent at an interval of 1s.
.withWaitStrategy(WaitStrategies.fixedWait(1, TimeUnit.SECONDS))
//Stop policy: 6 attempts
.withStopStrategy(StopStrategies.stopAfterAttempt(6))
//Time limit: A request cannot exceed 2s.
.withAttemptTimeLimiter(
AttemptTimeLimiters.fixedTimeLimit(2, TimeUnit.SECONDS))
//Register a custom listener (you can implement the listener after failure).
.withRetryListener(new MyRetryListener()).build();
try {
retryer.call(callable);
} catch (Exception ee) {
ee.printStackTrace();
}
}
If you require additional processing actions to occur when a retry is attempted, such as sending an alert email, then you can use RetryListener. After each retry, Guava Retrying automatically calls back your registered listener. You can register multiple RetryListeners, and Guava Retrying will sequentially call them in the order of registration.
public class MyRetryListener implements RetryListener {
@Override
public <V> void onRetry(Attempt<V> attempt) {
// The number of retries.
System.out.print("[retry]time=" + attempt.getAttemptNumber());
// The delay from the first retry.
System.out.print(",delay=" + attempt.getDelaySinceFirstAttempt());
// Retry result: terminated with exceptions or returned normally.
System.out.print(",hasException=" + attempt.hasException());
System.out.print(",hasResult=" + attempt.hasResult());
// The cause of the exception.
if (attempt.hasException()) {
System.out.print(",causeBy=" + attempt.getExceptionCause().toString());
// do something useful here
} else {
// The normally returned result.
System.out.print(",result=" + attempt.getResult());
}
System.out.println();
}
}
RetryerBuilder is a factory builder that allows customization of retry sources and supports multiple retry sources. You can configure the number of retries, retry timeout, and waiting interval. Additionally, you can create a Retryer instance.
The retry source of RetryerBuilder supports exception objects and custom assertion objects. It can simultaneously support multiple objects and is compatible with them.
• retryIfException: A retry is triggered when a runtime exception or a checked exception is thrown. It will not be triggered when an error is thrown.
• retryIfRuntimeException: A retry is triggered only when a runtime exception is thrown. It will not be triggered when a checked exception or error is thrown.
• retryIfExceptionOfType: A retry is triggered only when an exception of the specified type is thrown, for example a runtime exception such as NullPointerException or IllegalStateException, or a custom exception.
• retryIfResult: A retry is triggered when the value returned by the Callable matches the specified predicate.
StopStrategy: The stop strategy, which decides when to stop retrying. The following strategies are provided:
StopAfterDelayStrategy | Sets a maximum allowed execution time, for example 10s. Regardless of how many attempts have been made, once the total retry time exceeds this limit, the task is terminated and a RetryException is thrown. |
NeverStopStrategy | Never stops. It is used when you need to keep polling until the expected result is returned. |
StopAfterAttemptStrategy | Sets the maximum number of attempts. Once it is reached, retrying stops and a RetryException is thrown. |
WaitStrategy: The wait strategy, which controls the interval between attempts. The following strategies are provided:
FixedWaitStrategy | The fixed wait interval strategy. |
RandomWaitStrategy | The random wait interval strategy. You can provide a minimum and maximum interval, and the wait interval is a random value within this range. |
IncrementingWaitStrategy | The incremental wait interval strategy. You can provide an initial value and step size, and the wait interval increases as the number of retries increases. |
ExponentialWaitStrategy | The exponential wait interval strategy. The wait interval grows exponentially with the number of attempts. |
FibonacciWaitStrategy | The Fibonacci wait interval strategy. The wait interval follows the Fibonacci sequence. |
ExceptionWaitStrategy | The exception wait interval strategy. The wait interval is computed from the exception that was thrown. |
CompositeWaitStrategy | The composite wait interval strategy. It combines several wait strategies. |
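To make the difference between the wait strategies concrete, the sketch below computes the wait before each retry for four of them. It uses only the JDK and does not call guava-retrying; WaitSketch and its formulas are illustrative assumptions and do not necessarily match the exact numbers the library produces.

```java
import java.util.function.IntToLongFunction;

/** Hypothetical formulas; the argument is the 1-based number of the failed attempt. */
final class WaitSketch {

    private WaitSketch() {
    }

    /** Same wait every time. */
    static IntToLongFunction fixed(long millis) {
        return attempt -> millis;
    }

    /** initial, initial + step, initial + 2 * step, ... */
    static IntToLongFunction incrementing(long initialMillis, long stepMillis) {
        return attempt -> initialMillis + stepMillis * (attempt - 1);
    }

    /** base, 2 * base, 4 * base, ... */
    static IntToLongFunction exponential(long baseMillis) {
        return attempt -> baseMillis * (1L << (attempt - 1));
    }

    /** unit * 1, 1, 2, 3, 5, ... */
    static IntToLongFunction fibonacci(long unitMillis) {
        return attempt -> unitMillis * fib(attempt);
    }

    private static long fib(int n) {
        long previous = 0;
        long current = 1;
        for (int i = 1; i < n; i++) {
            long next = previous + current;
            previous = current;
            current = next;
        }
        return current;
    }
}
```

For example, exponential(100).applyAsLong(4) gives 800 milliseconds, while fibonacci(100).applyAsLong(5) gives 500 milliseconds, which shows how much more gently the Fibonacci schedule grows.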
The Guava Retryer tool is similar to Spring Retry in that it wraps normal retry logic by defining the role of the retryer. However, Guava Retryer has a more advanced strategy definition. It not only supports setting the number of retries and retry frequency control, but also allows the definition of multiple exceptions or custom objects as retry sources, providing more flexibility. This makes Guava Retryer suitable for a wider range of business scenarios, such as network requests and database access. Additionally, Guava Retryer is highly extensible and can easily be integrated with other Guava libraries.
Both Spring Retry and Guava Retrying are thread-safe retry tools that support retry logic in concurrent business scenarios and ensure the correctness of retries. Both support wait intervals between retries, differentiated retry strategies, and retry timeouts, which further improve the effectiveness of retries and the stability of the process.
Furthermore, both Spring Retry and Guava Retryer utilize the command design pattern to delegate the retry object and complete the corresponding logical operation. They both internally encapsulate the retry logic. This design pattern makes it easy to extend and modify the retry logic, while also improving code reusability.
Some business logic depends on unstable components. In such cases, a retry mechanism is needed to obtain the desired result or to re-execute the logic instead of terminating it immediately. Typical scenarios include remote interface access, data loading, and data upload verification.
Different exception scenarios require different retry methods, and it is important to decouple the normal logic from the retry logic. When setting up a retry strategy, several questions need to be answered. When is the appropriate time to retry? Should the retry be synchronous and blocking, or asynchronous with a delay? Can retries be switched off quickly so that calls fail fast? What is the impact on user experience if a failure is not retried? These factors should guide the choice of timeout, retry strategy, retry scenarios, and number of retries.
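The asynchronous option mentioned above can be sketched with a ScheduledExecutorService, so that the calling thread is not blocked while waiting between attempts. This is a minimal sketch under stated assumptions: AsyncRetry and its parameters are hypothetical, and a production version would also need cancellation and metrics.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical asynchronous retry with a fixed delay between attempts. */
final class AsyncRetry {

    private AsyncRetry() {
    }

    static <T> CompletableFuture<T> submit(ScheduledExecutorService scheduler, Callable<T> task,
                                           int maxAttempts, long delayMillis) {
        CompletableFuture<T> result = new CompletableFuture<>();
        // The first attempt runs without delay.
        schedule(scheduler, task, 1, maxAttempts, delayMillis, 0L, result);
        return result;
    }

    private static <T> void schedule(ScheduledExecutorService scheduler, Callable<T> task,
                                     int attemptNo, int maxAttempts, long delayMillis,
                                     long waitMillis, CompletableFuture<T> result) {
        scheduler.schedule(() -> {
            try {
                result.complete(task.call());
            } catch (Exception e) {
                if (attemptNo >= maxAttempts) {
                    // Out of attempts: report the last failure to the caller.
                    result.completeExceptionally(e);
                } else {
                    // Queue the next attempt instead of sleeping on this thread.
                    schedule(scheduler, task, attemptNo + 1, maxAttempts, delayMillis, delayMillis, result);
                }
            }
        }, waitMillis, TimeUnit.MILLISECONDS);
    }
}
```

The caller receives a CompletableFuture immediately and can attach callbacks or wait with a timeout, which keeps request threads free while the retries run in the background.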
This article only covers a small part of the retry mechanism. In actual applications, an appropriate failure retry scheme should be adopted based on the specific situation.
Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.